2022-11-23T01:39:17.9387536Z Requested labels: linux.16xlarge.nvidia.gpu 2022-11-23T01:39:17.9387613Z Job defined at: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/heads/master 2022-11-23T01:39:17.9387636Z Waiting for a runner to pick up this job... 2022-11-23T01:39:18.1227418Z Job is about to start running on the runner: i-08a957f819e89e94d (organization) 2022-11-23T01:39:22.8024171Z Current runner version: '2.299.1' 2022-11-23T01:39:22.8031776Z Runner name: 'i-08a957f819e89e94d' 2022-11-23T01:39:22.8032555Z Runner group name: 'Default' 2022-11-23T01:39:22.8033452Z Machine name: 'ip-10-0-8-67' 2022-11-23T01:39:22.8036057Z ##[group]GITHUB_TOKEN Permissions 2022-11-23T01:39:22.8037107Z Actions: write 2022-11-23T01:39:22.8037438Z Checks: write 2022-11-23T01:39:22.8038009Z Contents: write 2022-11-23T01:39:22.8038456Z Deployments: write 2022-11-23T01:39:22.8038804Z Discussions: write 2022-11-23T01:39:22.8039180Z Issues: write 2022-11-23T01:39:22.8039555Z Metadata: read 2022-11-23T01:39:22.8039893Z Packages: write 2022-11-23T01:39:22.8040309Z Pages: write 2022-11-23T01:39:22.8040703Z PullRequests: write 2022-11-23T01:39:22.8041258Z RepositoryProjects: write 2022-11-23T01:39:22.8041715Z SecurityEvents: write 2022-11-23T01:39:22.8042101Z Statuses: write 2022-11-23T01:39:22.8042456Z ##[endgroup] 2022-11-23T01:39:22.8047011Z Secret source: Actions 2022-11-23T01:39:22.8047768Z Prepare workflow directory 2022-11-23T01:39:22.9349392Z Prepare all required actions 2022-11-23T01:39:22.9570336Z Getting action download info 2022-11-23T01:39:23.1843163Z Download action repository 'pytorch/test-infra@main' (SHA:c57ff4d9a93667a5571a80a0e92c3e2674aeedfd) 2022-11-23T01:39:23.4869256Z Download action repository 'pytorch/pytorch@master' (SHA:1cfd3858ac54fe3883534309081631a0a892ba3f) 2022-11-23T01:39:27.0502391Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2022-11-23T01:39:27.3686352Z Getting action download info 
2022-11-23T01:39:27.5212437Z Download action repository 'malfet/checkout@silent-checkout' (SHA:c7b8fef48edfe1bca0044a44b1f7f7c4318a3076) 2022-11-23T01:39:27.7587378Z Getting action download info 2022-11-23T01:39:27.9342123Z Download action repository 'nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767' (SHA:7d4a37704547a311dbb66ebdf5b23ec19374a767) 2022-11-23T01:39:28.0832981Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml 2022-11-23T01:39:28.0835560Z ##[group] Inputs 2022-11-23T01:39:28.0835928Z build-environment: linux-bionic-cuda11.6-py3.9-gcc7 2022-11-23T01:39:28.0836392Z test-matrix: { include: [ { config: "multigpu", shard: 1, num_shards: 1, runner: "linux.16xlarge.nvidia.gpu" }, ]} 2022-11-23T01:39:28.0837001Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:39:28.0837485Z sync-tag: 2022-11-23T01:39:28.0838492Z timeout-minutes: 240 2022-11-23T01:39:28.0838778Z ##[endgroup] 2022-11-23T01:39:28.0839595Z Complete job name: linux-bionic-cuda11.6-py3.9-gcc7 / test (multigpu, 1, 1, linux.16xlarge.nvidia.gpu, mem_leak_check) 2022-11-23T01:39:28.2151523Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2022-11-23T01:39:28.2151924Z with: 2022-11-23T01:39:28.2152477Z github-secret: *** 2022-11-23T01:39:28.2152789Z activate-with-label: false 2022-11-23T01:39:28.2153064Z label: with-ssh 2022-11-23T01:39:28.2153327Z remove-existing-keys: true 2022-11-23T01:39:28.2153593Z env: 2022-11-23T01:39:28.2153843Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:39:28.2154247Z ##[endgroup] 2022-11-23T01:39:28.3250290Z Not on pull request and ciflow reference could not be extracted, skipping adding ssh keys 2022-11-23T01:39:28.3497641Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@master 2022-11-23T01:39:28.3498000Z with: 2022-11-23T01:39:28.3498249Z submodules: recursive 2022-11-23T01:39:28.3498534Z fetch-depth: 0 
2022-11-23T01:39:28.3498796Z env: 2022-11-23T01:39:28.3499045Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:39:28.3499310Z ##[endgroup] 2022-11-23T01:39:28.3794880Z ##[group]Run retry () { 2022-11-23T01:39:28.3795223Z retry () { 2022-11-23T01:39:28.3795543Z  $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*) 2022-11-23T01:39:28.3795845Z } 2022-11-23T01:39:28.3796087Z echo "${GITHUB_WORKSPACE}" 2022-11-23T01:39:28.3796573Z if [ -z "${NO_SUDO}" ]; then 2022-11-23T01:39:28.3796890Z  retry sudo rm -rf "${GITHUB_WORKSPACE}" 2022-11-23T01:39:28.3797154Z else 2022-11-23T01:39:28.3797438Z  retry rm -rf "${GITHUB_WORKSPACE}" 2022-11-23T01:39:28.3797715Z fi 2022-11-23T01:39:28.3798005Z mkdir "${GITHUB_WORKSPACE}" 2022-11-23T01:39:28.3816433Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:39:28.3816744Z env: 2022-11-23T01:39:28.3816998Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:39:28.3817263Z NO_SUDO: 2022-11-23T01:39:28.3817484Z ##[endgroup] 2022-11-23T01:39:28.3945762Z /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:39:28.4665030Z ##[group]Run malfet/checkout@silent-checkout 2022-11-23T01:39:28.4665344Z with: 2022-11-23T01:39:28.4665603Z ref: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:39:28.4665892Z fetch-depth: 0 2022-11-23T01:39:28.4666154Z submodules: recursive 2022-11-23T01:39:28.4666413Z quiet-checkout: true 2022-11-23T01:39:28.4666692Z repository: pytorch/pytorch 2022-11-23T01:39:28.4667154Z token: *** 2022-11-23T01:39:28.4667595Z ssh-strict: true 2022-11-23T01:39:28.4667858Z persist-credentials: true 2022-11-23T01:39:28.4668131Z clean: true 2022-11-23T01:39:28.4668380Z lfs: false 2022-11-23T01:39:28.4668782Z set-safe-directory: true 2022-11-23T01:39:28.4669032Z env: 2022-11-23T01:39:28.4669265Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:39:28.4669499Z ##[endgroup] 2022-11-23T01:39:28.6196619Z Syncing repository: pytorch/pytorch 2022-11-23T01:39:28.6198454Z ##[group]Getting Git version info 
2022-11-23T01:39:28.6199015Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2022-11-23T01:39:28.6199622Z [command]/usr/bin/git version 2022-11-23T01:39:28.6199878Z git version 2.37.1 2022-11-23T01:39:28.6207968Z ##[endgroup] 2022-11-23T01:39:28.6229135Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/bc069309-6877-49fd-8122-edc00b9e63c9' before making global git config changes 2022-11-23T01:39:28.6230383Z Adding repository directory to the temporary git global config as a safe directory 2022-11-23T01:39:28.6235685Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:39:28.6282499Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2022-11-23T01:39:28.6290042Z ##[group]Initializing the repository 2022-11-23T01:39:28.6292460Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:39:28.6331620Z hint: Using 'master' as the name for the initial branch. This default branch name 2022-11-23T01:39:28.6333196Z hint: is subject to change. To configure the initial branch name to use in all 2022-11-23T01:39:28.6334361Z hint: of your new repositories, which will suppress this warning, call: 2022-11-23T01:39:28.6334694Z hint: 2022-11-23T01:39:28.6335174Z hint: git config --global init.defaultBranch <name> 2022-11-23T01:39:28.6335436Z hint: 2022-11-23T01:39:28.6335826Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2022-11-23T01:39:28.6336350Z hint: 'development'. 
The just-created branch can be renamed via this command: 2022-11-23T01:39:28.6336685Z hint: 2022-11-23T01:39:28.6337150Z hint: git branch -m <name> 2022-11-23T01:39:28.6337703Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/ 2022-11-23T01:39:28.6347171Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2022-11-23T01:39:28.6385501Z ##[endgroup] 2022-11-23T01:39:28.6386152Z ##[group]Disabling automatic garbage collection 2022-11-23T01:39:28.6391428Z [command]/usr/bin/git config --local gc.auto 0 2022-11-23T01:39:28.6424666Z ##[endgroup] 2022-11-23T01:39:28.6425523Z ##[group]Setting up auth 2022-11-23T01:39:28.6435813Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-11-23T01:39:28.6471982Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-11-23T01:39:28.6868679Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-11-23T01:39:28.6900897Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-11-23T01:39:28.7245392Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2022-11-23T01:39:28.7306635Z ##[endgroup] 2022-11-23T01:39:28.7307168Z ##[group]Fetching the repository 2022-11-23T01:39:28.7317094Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --quiet --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2022-11-23T01:40:21.0354907Z [command]/usr/bin/git rev-parse --verify --quiet 1cfd3858ac54fe3883534309081631a0a892ba3f^{object} 2022-11-23T01:40:21.0390096Z 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:40:21.0394122Z 
##[endgroup] 2022-11-23T01:40:21.0394705Z ##[group]Determining the checkout info 2022-11-23T01:40:21.0396034Z ##[endgroup] 2022-11-23T01:40:21.0396486Z ##[group]Checking out the ref 2022-11-23T01:40:21.0402839Z [command]/usr/bin/git checkout --quiet --force 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:40:22.8045007Z ##[endgroup] 2022-11-23T01:40:22.8045580Z ##[group]Setting up auth for fetching submodules 2022-11-23T01:40:22.8052719Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2022-11-23T01:40:22.8119635Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2022-11-23T01:40:22.8153022Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2022-11-23T01:40:22.8187084Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2022-11-23T01:40:22.8219839Z ##[endgroup] 2022-11-23T01:40:22.8220830Z ##[group]Fetching submodules 2022-11-23T01:40:22.8223883Z [command]/usr/bin/git submodule sync --recursive 2022-11-23T01:40:22.8594145Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2022-11-23T01:40:22.8945893Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2022-11-23T01:40:22.8947176Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2022-11-23T01:40:22.8950758Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2022-11-23T01:40:22.8953710Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2022-11-23T01:40:22.8956840Z Submodule 'third_party/QNNPACK' (https://github.com/pytorch/QNNPACK) registered for path 'third_party/QNNPACK' 2022-11-23T01:40:22.8962134Z Submodule 
'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2022-11-23T01:40:22.8964038Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2022-11-23T01:40:22.8969735Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2022-11-23T01:40:22.8973288Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2022-11-23T01:40:22.8977322Z Submodule 'third_party/cub' (https://github.com/NVlabs/cub.git) registered for path 'third_party/cub' 2022-11-23T01:40:22.8982463Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2022-11-23T01:40:22.8986130Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2022-11-23T01:40:22.8991962Z Submodule 'third_party/eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'third_party/eigen' 2022-11-23T01:40:22.8995389Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2022-11-23T01:40:22.9001419Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2022-11-23T01:40:22.9004972Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2022-11-23T01:40:22.9010268Z Submodule 'third_party/foxi' (https://github.com/houseroad/foxi.git) registered for path 'third_party/foxi' 2022-11-23T01:40:22.9014839Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:40:22.9020420Z Submodule 'third_party/gloo' 
(https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo' 2022-11-23T01:40:22.9026434Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2022-11-23T01:40:22.9031932Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2022-11-23T01:40:22.9037527Z Submodule 'third_party/ios-cmake' (https://github.com/Yangqing/ios-cmake.git) registered for path 'third_party/ios-cmake' 2022-11-23T01:40:22.9043557Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2022-11-23T01:40:22.9048948Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2022-11-23T01:40:22.9055018Z Submodule 'third_party/nccl/nccl' (https://github.com/NVIDIA/nccl) registered for path 'third_party/nccl/nccl' 2022-11-23T01:40:22.9061156Z Submodule 'third_party/neon2sse' (https://github.com/intel/ARM_NEON_2_x86_SSE.git) registered for path 'third_party/neon2sse' 2022-11-23T01:40:22.9067984Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2022-11-23T01:40:22.9074851Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2022-11-23T01:40:22.9081722Z Submodule 'third_party/onnx-tensorrt' (https://github.com/onnx/onnx-tensorrt) registered for path 'third_party/onnx-tensorrt' 2022-11-23T01:40:22.9088150Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2022-11-23T01:40:22.9094634Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2022-11-23T01:40:22.9101136Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 
2022-11-23T01:40:22.9109251Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2022-11-23T01:40:22.9116357Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2022-11-23T01:40:22.9123504Z Submodule 'third_party/python-enum' (https://github.com/PeachPy/enum34.git) registered for path 'third_party/python-enum' 2022-11-23T01:40:22.9130577Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2022-11-23T01:40:22.9139140Z Submodule 'third_party/python-six' (https://github.com/benjaminp/six.git) registered for path 'third_party/python-six' 2022-11-23T01:40:22.9145963Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2022-11-23T01:40:22.9154168Z Submodule 'third_party/tbb' (https://github.com/01org/tbb) registered for path 'third_party/tbb' 2022-11-23T01:40:22.9162534Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2022-11-23T01:40:22.9170613Z Submodule 'third_party/zstd' (https://github.com/facebook/zstd.git) registered for path 'third_party/zstd' 2022-11-23T01:40:22.9208360Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2022-11-23T01:40:23.2025547Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 2022-11-23T01:40:23.4258008Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2022-11-23T01:40:23.6283174Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2022-11-23T01:40:23.9201466Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/QNNPACK'... 
2022-11-23T01:40:24.2083361Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2022-11-23T01:40:26.2025527Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2022-11-23T01:40:32.1592801Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 2022-11-23T01:40:32.6399261Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 2022-11-23T01:40:33.2291316Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cub'... 2022-11-23T01:40:34.8239402Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2022-11-23T01:40:36.1663157Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'... 2022-11-23T01:40:37.9824467Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/eigen'... 2022-11-23T01:40:46.1241296Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2022-11-23T01:40:46.8808149Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 2022-11-23T01:40:48.3821024Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2022-11-23T01:40:49.4825308Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/foxi'... 2022-11-23T01:40:49.6832295Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2022-11-23T01:40:50.1690178Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2022-11-23T01:40:50.5432548Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2022-11-23T01:40:51.5253136Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 
2022-11-23T01:40:51.9411518Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ios-cmake'... 2022-11-23T01:40:52.1364518Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2022-11-23T01:40:52.3812002Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2022-11-23T01:40:54.1837054Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nccl/nccl'... 2022-11-23T01:40:55.8986474Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/neon2sse'... 2022-11-23T01:40:56.3888450Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2022-11-23T01:41:02.6985626Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2022-11-23T01:41:04.4721959Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt'... 2022-11-23T01:41:04.9323882Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2022-11-23T01:41:05.1672423Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 2022-11-23T01:41:11.4242915Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2022-11-23T01:41:11.6315861Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 2022-11-23T01:41:11.9001533Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 2022-11-23T01:41:12.7860320Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-enum'... 2022-11-23T01:41:13.0196443Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2022-11-23T01:41:13.3458120Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-six'... 
2022-11-23T01:41:13.6999330Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2022-11-23T01:41:14.2857467Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tbb'... 2022-11-23T01:41:16.9316745Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2022-11-23T01:41:17.4422260Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/zstd'... 2022-11-23T01:41:19.7672435Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2022-11-23T01:41:19.7809867Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2022-11-23T01:41:19.7922177Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2022-11-23T01:41:19.8231351Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2022-11-23T01:41:19.8525241Z Submodule path 'third_party/QNNPACK': checked out '7d2a4e9931a82adc3814275b6219a03e24e36b4c' 2022-11-23T01:41:19.8974055Z Submodule path 'third_party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191' 2022-11-23T01:41:20.7258801Z Submodule path 'third_party/XNNPACK': checked out 'ae108ef49aa5623b896fc93d4298c49d1750d9ba' 2022-11-23T01:41:20.7541726Z Submodule path 'third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2022-11-23T01:41:20.8830343Z Submodule path 'third_party/cpuinfo': checked out '8ec7bd91ad0470e61cf38f618cc1f270dede599c' 2022-11-23T01:41:20.9267277Z Submodule path 'third_party/cub': checked out 'd106ddb991a56c3df1b6d51b2409e36ba8181ce4' 2022-11-23T01:41:21.3004714Z Submodule path 'third_party/cudnn_frontend': checked out '171a7a986f7fbd9ed71bd0cf3c7ad4f55843d6b3' 2022-11-23T01:41:21.8367094Z Submodule path 'third_party/cutlass': checked out 'b72cbf957df8cf84a6d0ff91c190ad51a9c1d24a' 2022-11-23T01:41:22.1465648Z Submodule path 
'third_party/eigen': checked out '3147391d946bb4b6c68edd901f2add6ac1f31f8c' 2022-11-23T01:41:22.2051262Z Submodule path 'third_party/fbgemm': checked out '4d1738b3142a6cb0c032cd639e239566010b054a' 2022-11-23T01:41:22.2072638Z Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:41:22.2074530Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:41:22.2078092Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:41:22.2081102Z Submodule 'third_party/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:41:22.2115640Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/asmjit'... 2022-11-23T01:41:23.1569620Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cpuinfo'... 2022-11-23T01:41:23.7001808Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/googletest'... 2022-11-23T01:41:24.6988737Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/hipify_torch'... 
2022-11-23T01:41:25.0164922Z Submodule path 'third_party/fbgemm/third_party/asmjit': checked out 'd3fbf7c9bc7c1d1365a94a45614b91c5a3706b81' 2022-11-23T01:41:25.1447889Z Submodule path 'third_party/fbgemm/third_party/cpuinfo': checked out 'ed8b86a253800bafdb7b25c5c399f91bff9cb1f3' 2022-11-23T01:41:25.2195633Z Submodule path 'third_party/fbgemm/third_party/googletest': checked out 'cbf019de22c8dd37b2108da35b2748fd702d1796' 2022-11-23T01:41:25.2324069Z Submodule path 'third_party/fbgemm/third_party/hipify_torch': checked out '1840658c184f3eeba787dae0f06c45756c1daaf5' 2022-11-23T01:41:25.3528401Z Submodule path 'third_party/flatbuffers': checked out 'd0cede9c90c5257537c293517a21376408b549fa' 2022-11-23T01:41:25.3969544Z Submodule path 'third_party/fmt': checked out '7bdf0628b1276379886c7f6dda2cef2b3b374f0b' 2022-11-23T01:41:25.4084934Z Submodule path 'third_party/foxi': checked out 'c278588e34e535f0bb8f00df3880d26928038cad' 2022-11-23T01:41:25.4575169Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2022-11-23T01:41:25.4876329Z Submodule path 'third_party/gloo': checked out '4a5e339b764261d20fc409071dc7a8b8989aa195' 2022-11-23T01:41:25.5455765Z Submodule path 'third_party/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2022-11-23T01:41:25.5603760Z Submodule path 'third_party/ideep': checked out '5ddc65efe0428bbce2942b3ce5e3ce15239abe2f' 2022-11-23T01:41:25.5622414Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2022-11-23T01:41:25.5652694Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 
2022-11-23T01:41:34.6275701Z Submodule path 'third_party/ideep/mkl-dnn': checked out 'd19d0f795c60695bd32f894c6f01771b2dfbe24d' 2022-11-23T01:41:34.6299107Z Submodule 'third_party/oneDNN' (https://github.com/oneapi-src/oneDNN.git) registered for path 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:41:34.6330849Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn/third_party/oneDNN'... 2022-11-23T01:41:43.9196242Z Submodule path 'third_party/ideep/mkl-dnn/third_party/oneDNN': checked out '650085b2f3643aad05c629425983491d63b5c289' 2022-11-23T01:41:43.9335189Z Submodule path 'third_party/ios-cmake': checked out '8abaed637d56f1337d6e1d2c4026e25c1eade724' 2022-11-23T01:41:43.9521256Z Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42' 2022-11-23T01:41:44.0666787Z Submodule path 'third_party/kineto': checked out '6c1629809068efd78a8d56b4aa479c7ec49ae562' 2022-11-23T01:41:44.0687687Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:41:44.0689099Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:41:44.0723528Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2022-11-23T01:41:45.2293171Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 
2022-11-23T01:41:46.3158901Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '2591ab91c3898c9f6544fff04660276537d32ffd' 2022-11-23T01:41:46.3850084Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2022-11-23T01:41:46.4112514Z Submodule path 'third_party/nccl/nccl': checked out 'f89fd4777d2ef9229c039ff750ae21da01626f52' 2022-11-23T01:41:46.4284081Z Submodule path 'third_party/neon2sse': checked out '97a126f08ce318023be604d03f88bf0820a9464a' 2022-11-23T01:41:46.5665991Z Submodule path 'third_party/nlohmann': checked out '87cda1d6646592ac5866dc703c8e1839046a6806' 2022-11-23T01:41:46.9284023Z Submodule path 'third_party/onnx': checked out 'f7ee1ac60d06abe8e26c9b6bbe1e3db5286b614b' 2022-11-23T01:41:46.9318401Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx/third_party/benchmark' 2022-11-23T01:41:46.9319435Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2022-11-23T01:41:46.9349981Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/benchmark'... 2022-11-23T01:41:47.3400720Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 
2022-11-23T01:41:48.2156270Z Submodule path 'third_party/onnx/third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2022-11-23T01:41:48.2562301Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'ffa346860b306c9bbfb341aed9c14c067751feb8' 2022-11-23T01:41:48.2760104Z Submodule path 'third_party/onnx-tensorrt': checked out 'c153211418a7c57ce071d9ce2a41f8d1c85a878f' 2022-11-23T01:41:48.2778549Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:41:48.2808199Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx'... 2022-11-23T01:41:50.0544181Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx': checked out '765f5ee823a67a866f4bd28a9860e81f3c811ce8' 2022-11-23T01:41:50.0568333Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:41:50.0569719Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:41:50.0606353Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'... 2022-11-23T01:41:50.4746986Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'... 
2022-11-23T01:41:51.3531923Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark': checked out 'e776aa0275e293707b6a0901e0e8d8a8a3679508' 2022-11-23T01:41:51.4351537Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11': checked out 'a1041190c8b8ff0cd9e2f0752248ad5e3789ea0c' 2022-11-23T01:41:51.4371863Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:41:51.4403370Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'... 2022-11-23T01:41:51.6678737Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2022-11-23T01:41:51.6793227Z Submodule path 'third_party/pocketfft': checked out 'ea778e37710c07723435b1be58235996d1d43a5a' 2022-11-23T01:41:52.0077274Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2022-11-23T01:41:52.0100533Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:41:52.0102419Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2022-11-23T01:41:52.0133226Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2022-11-23T01:41:53.2849506Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 
2022-11-23T01:41:54.2963152Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2022-11-23T01:41:54.3825515Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2022-11-23T01:41:54.3937435Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2022-11-23T01:41:54.4075592Z Submodule path 'third_party/pthreadpool': checked out 'a134dd5d4cee80cce15db81a72e7f929d71dd413' 2022-11-23T01:41:54.4493332Z Submodule path 'third_party/pybind11': checked out '80dc998efced8ceb2be59756668a7e90e8bef917' 2022-11-23T01:41:54.4607856Z Submodule path 'third_party/python-enum': checked out '4cfedc426c4e2fc52e3f5c2b4297e15ed8d6b8c7' 2022-11-23T01:41:54.4958391Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2022-11-23T01:41:54.5076808Z Submodule path 'third_party/python-six': checked out '15e31431af97e5e64b80af0a3f598d382bcdd49a' 2022-11-23T01:41:54.5636999Z Submodule path 'third_party/sleef': checked out 'e0a003ee838b75d11763aa9c3ef17bf71a725bff' 2022-11-23T01:41:54.7074479Z Submodule path 'third_party/tbb': checked out 'a51a90bc609bb73db8ea13841b5cf7aa4344d4a9' 2022-11-23T01:41:54.7408855Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e' 2022-11-23T01:41:54.7430528Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:41:54.7431979Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:41:54.7435895Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:41:54.7440095Z Submodule 'third_party/pybind11' 
(https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:41:54.7472659Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2022-11-23T01:41:55.7261483Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2022-11-23T01:41:56.0226670Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2022-11-23T01:41:57.3563904Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2022-11-23T01:41:58.2884641Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2022-11-23T01:41:58.3073211Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2022-11-23T01:41:58.3905049Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242' 2022-11-23T01:41:58.4250179Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2022-11-23T01:41:58.4269498Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:41:58.4298175Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 
2022-11-23T01:41:58.6511521Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2022-11-23T01:41:58.8171255Z Submodule path 'third_party/zstd': checked out 'aec56a52fbab207fc639a1937d1e708a282edca8' 2022-11-23T01:41:58.8220266Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2022-11-23T01:41:58.8573394Z Entering 'android/libs/fbjni' 2022-11-23T01:41:58.8620911Z Entering 'third_party/FP16' 2022-11-23T01:41:58.8669814Z Entering 'third_party/FXdiv' 2022-11-23T01:41:58.8722284Z Entering 'third_party/NNPACK' 2022-11-23T01:41:58.8771992Z Entering 'third_party/QNNPACK' 2022-11-23T01:41:58.8823134Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:41:58.8873087Z Entering 'third_party/XNNPACK' 2022-11-23T01:41:58.8934928Z Entering 'third_party/benchmark' 2022-11-23T01:41:58.8984879Z Entering 'third_party/cpuinfo' 2022-11-23T01:41:58.9033824Z Entering 'third_party/cub' 2022-11-23T01:41:58.9087979Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:41:58.9145144Z Entering 'third_party/cutlass' 2022-11-23T01:41:58.9199590Z Entering 'third_party/eigen' 2022-11-23T01:41:58.9251088Z Entering 'third_party/fbgemm' 2022-11-23T01:41:58.9298906Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:41:58.9348833Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:41:58.9398623Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:41:58.9446782Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:41:58.9497721Z Entering 'third_party/flatbuffers' 2022-11-23T01:41:58.9547747Z Entering 'third_party/fmt' 2022-11-23T01:41:58.9594452Z Entering 'third_party/foxi' 2022-11-23T01:41:58.9642490Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:41:58.9695162Z Entering 'third_party/gloo' 2022-11-23T01:41:58.9745856Z Entering 'third_party/googletest' 2022-11-23T01:41:58.9795957Z Entering 'third_party/ideep' 
2022-11-23T01:41:58.9843263Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:41:58.9896783Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:41:58.9952845Z Entering 'third_party/ios-cmake' 2022-11-23T01:41:58.9997848Z Entering 'third_party/ittapi' 2022-11-23T01:41:59.0045594Z Entering 'third_party/kineto' 2022-11-23T01:41:59.0093466Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:41:59.0142309Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:41:59.0191407Z Entering 'third_party/nccl/nccl' 2022-11-23T01:41:59.0238692Z Entering 'third_party/neon2sse' 2022-11-23T01:41:59.0285334Z Entering 'third_party/nlohmann' 2022-11-23T01:41:59.0334277Z Entering 'third_party/onnx' 2022-11-23T01:41:59.0396226Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:41:59.0443780Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:41:59.0493844Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:41:59.0541615Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:41:59.0594684Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:41:59.0643013Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:41:59.0690134Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:41:59.0744627Z Entering 'third_party/pocketfft' 2022-11-23T01:41:59.0794068Z Entering 'third_party/protobuf' 2022-11-23T01:41:59.0845248Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:41:59.0894154Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:41:59.0945179Z Entering 'third_party/psimd' 2022-11-23T01:41:59.0992392Z Entering 'third_party/pthreadpool' 2022-11-23T01:41:59.1040850Z Entering 'third_party/pybind11' 2022-11-23T01:41:59.1090218Z Entering 'third_party/python-enum' 2022-11-23T01:41:59.1137154Z Entering 'third_party/python-peachpy' 
2022-11-23T01:41:59.1320463Z Entering 'third_party/python-six' 2022-11-23T01:41:59.1396259Z Entering 'third_party/sleef' 2022-11-23T01:41:59.1557339Z Entering 'third_party/tbb' 2022-11-23T01:41:59.1607812Z Entering 'third_party/tensorpipe' 2022-11-23T01:41:59.1654402Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:41:59.1702217Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:41:59.1747476Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:41:59.1795312Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:41:59.1839517Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:41:59.1889688Z Entering 'third_party/zstd' 2022-11-23T01:41:59.1951317Z ##[endgroup] 2022-11-23T01:41:59.1951915Z ##[group]Persisting credentials for submodules 2022-11-23T01:41:59.1957038Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || : 2022-11-23T01:41:59.2296619Z Entering 'android/libs/fbjni' 2022-11-23T01:41:59.2341470Z Entering 'third_party/FP16' 2022-11-23T01:41:59.2387165Z Entering 'third_party/FXdiv' 2022-11-23T01:41:59.2431844Z Entering 'third_party/NNPACK' 2022-11-23T01:41:59.2477486Z Entering 'third_party/QNNPACK' 2022-11-23T01:41:59.2522413Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:41:59.2568983Z Entering 'third_party/XNNPACK' 2022-11-23T01:41:59.2626008Z Entering 'third_party/benchmark' 2022-11-23T01:41:59.2669255Z Entering 'third_party/cpuinfo' 2022-11-23T01:41:59.2714773Z Entering 'third_party/cub' 2022-11-23T01:41:59.2759775Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:41:59.2811330Z Entering 'third_party/cutlass' 2022-11-23T01:41:59.2863533Z Entering 'third_party/eigen' 2022-11-23T01:41:59.2910756Z Entering 'third_party/fbgemm' 2022-11-23T01:41:59.2955531Z Entering 
'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:41:59.2999913Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:41:59.3043426Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:41:59.3086774Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:41:59.3132048Z Entering 'third_party/flatbuffers' 2022-11-23T01:41:59.3177574Z Entering 'third_party/fmt' 2022-11-23T01:41:59.3223035Z Entering 'third_party/foxi' 2022-11-23T01:41:59.3267342Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:41:59.3312800Z Entering 'third_party/gloo' 2022-11-23T01:41:59.3363073Z Entering 'third_party/googletest' 2022-11-23T01:41:59.3409400Z Entering 'third_party/ideep' 2022-11-23T01:41:59.3456780Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:41:59.3504331Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:41:59.3559078Z Entering 'third_party/ios-cmake' 2022-11-23T01:41:59.3603041Z Entering 'third_party/ittapi' 2022-11-23T01:41:59.3654397Z Entering 'third_party/kineto' 2022-11-23T01:41:59.3705478Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:41:59.3750932Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:41:59.3798898Z Entering 'third_party/nccl/nccl' 2022-11-23T01:41:59.3847209Z Entering 'third_party/neon2sse' 2022-11-23T01:41:59.3893143Z Entering 'third_party/nlohmann' 2022-11-23T01:41:59.3946332Z Entering 'third_party/onnx' 2022-11-23T01:41:59.4008726Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:41:59.4057079Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:41:59.4109856Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:41:59.4157855Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:41:59.4208938Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:41:59.4257291Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 
2022-11-23T01:41:59.4305052Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:41:59.4359068Z Entering 'third_party/pocketfft' 2022-11-23T01:41:59.4405833Z Entering 'third_party/protobuf' 2022-11-23T01:41:59.4456589Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:41:59.4503591Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:41:59.4552728Z Entering 'third_party/psimd' 2022-11-23T01:41:59.4599652Z Entering 'third_party/pthreadpool' 2022-11-23T01:41:59.4646906Z Entering 'third_party/pybind11' 2022-11-23T01:41:59.4695598Z Entering 'third_party/python-enum' 2022-11-23T01:41:59.4745092Z Entering 'third_party/python-peachpy' 2022-11-23T01:41:59.4792807Z Entering 'third_party/python-six' 2022-11-23T01:41:59.4841574Z Entering 'third_party/sleef' 2022-11-23T01:41:59.4893257Z Entering 'third_party/tbb' 2022-11-23T01:41:59.4944030Z Entering 'third_party/tensorpipe' 2022-11-23T01:41:59.4993031Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:41:59.5042317Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:41:59.5091036Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:41:59.5140199Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:41:59.5187891Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:41:59.5243681Z Entering 'third_party/zstd' 2022-11-23T01:41:59.5306956Z [command]/usr/bin/git submodule foreach --recursive git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url 2022-11-23T01:41:59.5680646Z Entering 'android/libs/fbjni' 2022-11-23T01:41:59.5724430Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2022-11-23T01:41:59.5745506Z Entering 'third_party/FP16' 2022-11-23T01:41:59.5788449Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2022-11-23T01:41:59.5808603Z Entering 'third_party/FXdiv' 2022-11-23T01:41:59.5853377Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2022-11-23T01:41:59.5875016Z Entering 'third_party/NNPACK' 2022-11-23T01:41:59.5920231Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2022-11-23T01:41:59.5940640Z Entering 'third_party/QNNPACK' 2022-11-23T01:41:59.5986568Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/QNNPACK/config remote.origin.url 2022-11-23T01:41:59.6007092Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:41:59.6054623Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2022-11-23T01:41:59.6072598Z Entering 'third_party/XNNPACK' 2022-11-23T01:41:59.6117362Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2022-11-23T01:41:59.6151325Z Entering 'third_party/benchmark' 2022-11-23T01:41:59.6194993Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:41:59.6216147Z Entering 'third_party/cpuinfo' 2022-11-23T01:41:59.6259306Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2022-11-23T01:41:59.6282309Z Entering 'third_party/cub' 2022-11-23T01:41:59.6326497Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cub/config remote.origin.url 2022-11-23T01:41:59.6346407Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:41:59.6393374Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config 
remote.origin.url 2022-11-23T01:41:59.6421262Z Entering 'third_party/cutlass' 2022-11-23T01:41:59.6468546Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2022-11-23T01:41:59.6497660Z Entering 'third_party/eigen' 2022-11-23T01:41:59.6543361Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/eigen/config remote.origin.url 2022-11-23T01:41:59.6565396Z Entering 'third_party/fbgemm' 2022-11-23T01:41:59.6611985Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2022-11-23T01:41:59.6630867Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:41:59.6676551Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/asmjit/config remote.origin.url 2022-11-23T01:41:59.6697053Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:41:59.6745585Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cpuinfo/config remote.origin.url 2022-11-23T01:41:59.6767672Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:41:59.6810449Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:41:59.6830180Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:41:59.6875237Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/hipify_torch/config remote.origin.url 2022-11-23T01:41:59.6895935Z Entering 'third_party/flatbuffers' 2022-11-23T01:41:59.6941874Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2022-11-23T01:41:59.6962807Z Entering 'third_party/fmt' 2022-11-23T01:41:59.7011221Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2022-11-23T01:41:59.7031573Z Entering 'third_party/foxi' 2022-11-23T01:41:59.7077017Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/foxi/config remote.origin.url 2022-11-23T01:41:59.7095776Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:41:59.7143739Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2022-11-23T01:41:59.7166548Z Entering 'third_party/gloo' 2022-11-23T01:41:59.7209427Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2022-11-23T01:41:59.7229441Z Entering 'third_party/googletest' 2022-11-23T01:41:59.7273141Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:41:59.7292756Z Entering 'third_party/ideep' 2022-11-23T01:41:59.7336652Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2022-11-23T01:41:59.7355659Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:41:59.7399230Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2022-11-23T01:41:59.7422187Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:41:59.7467836Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/modules/third_party/oneDNN/config remote.origin.url 2022-11-23T01:41:59.7497549Z Entering 'third_party/ios-cmake' 2022-11-23T01:41:59.7542281Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ios-cmake/config remote.origin.url 2022-11-23T01:41:59.7562706Z Entering 'third_party/ittapi' 2022-11-23T01:41:59.7605528Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2022-11-23T01:41:59.7627453Z Entering 'third_party/kineto' 2022-11-23T01:41:59.7673371Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2022-11-23T01:41:59.7692152Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:41:59.7736342Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2022-11-23T01:41:59.7756195Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:41:59.7800953Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2022-11-23T01:41:59.7825480Z Entering 'third_party/nccl/nccl' 2022-11-23T01:41:59.7871057Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nccl/nccl/config remote.origin.url 2022-11-23T01:41:59.7892147Z Entering 'third_party/neon2sse' 2022-11-23T01:41:59.7936298Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/neon2sse/config remote.origin.url 2022-11-23T01:41:59.7956005Z Entering 'third_party/nlohmann' 2022-11-23T01:41:59.7999752Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2022-11-23T01:41:59.8023717Z Entering 'third_party/onnx' 2022-11-23T01:41:59.8070609Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2022-11-23T01:41:59.8104032Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:41:59.8148581Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:41:59.8168484Z Entering 'third_party/onnx/third_party/pybind11' 
2022-11-23T01:41:59.8213668Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:41:59.8236740Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:41:59.8279792Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/config remote.origin.url 2022-11-23T01:41:59.8300404Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:41:59.8344672Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/config remote.origin.url 2022-11-23T01:41:59.8370364Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:41:59.8415442Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:41:59.8435757Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:41:59.8481601Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:41:59.8505525Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:41:59.8546119Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-11-23T01:41:59.8574416Z Entering 'third_party/pocketfft' 2022-11-23T01:41:59.8619720Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2022-11-23T01:41:59.8640135Z Entering 'third_party/protobuf' 2022-11-23T01:41:59.8687104Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config 
remote.origin.url 2022-11-23T01:41:59.8711154Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:41:59.8755320Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:41:59.8779448Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:41:59.8823822Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:41:59.8845289Z Entering 'third_party/psimd' 2022-11-23T01:41:59.8890382Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2022-11-23T01:41:59.8909333Z Entering 'third_party/pthreadpool' 2022-11-23T01:41:59.8954161Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2022-11-23T01:41:59.8974505Z Entering 'third_party/pybind11' 2022-11-23T01:41:59.9019802Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:41:59.9038215Z Entering 'third_party/python-enum' 2022-11-23T01:41:59.9082982Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-enum/config remote.origin.url 2022-11-23T01:41:59.9102771Z Entering 'third_party/python-peachpy' 2022-11-23T01:41:59.9147629Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2022-11-23T01:41:59.9168414Z Entering 'third_party/python-six' 2022-11-23T01:41:59.9214055Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-six/config remote.origin.url 2022-11-23T01:41:59.9234261Z Entering 'third_party/sleef' 2022-11-23T01:41:59.9279318Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config 
remote.origin.url 2022-11-23T01:41:59.9298433Z Entering 'third_party/tbb' 2022-11-23T01:41:59.9345725Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tbb/config remote.origin.url 2022-11-23T01:41:59.9367652Z Entering 'third_party/tensorpipe' 2022-11-23T01:41:59.9415512Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2022-11-23T01:41:59.9434498Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:41:59.9483479Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:41:59.9503599Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:41:59.9550180Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2022-11-23T01:41:59.9570523Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:41:59.9617661Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2022-11-23T01:41:59.9640432Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:41:59.9684810Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:41:59.9704302Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:41:59.9752431Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-11-23T01:41:59.9778241Z Entering 'third_party/zstd' 2022-11-23T01:41:59.9823692Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/zstd/config remote.origin.url 2022-11-23T01:42:00.0823631Z [command]/usr/bin/git 
submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2022-11-23T01:42:00.1179822Z Entering 'android/libs/fbjni' 2022-11-23T01:42:00.1226806Z Entering 'third_party/FP16' 2022-11-23T01:42:00.1273750Z Entering 'third_party/FXdiv' 2022-11-23T01:42:00.1320151Z Entering 'third_party/NNPACK' 2022-11-23T01:42:00.1366002Z Entering 'third_party/QNNPACK' 2022-11-23T01:42:00.1413513Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:42:00.1459872Z Entering 'third_party/XNNPACK' 2022-11-23T01:42:00.1518370Z Entering 'third_party/benchmark' 2022-11-23T01:42:00.1567881Z Entering 'third_party/cpuinfo' 2022-11-23T01:42:00.1621791Z Entering 'third_party/cub' 2022-11-23T01:42:00.1675345Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:42:00.1736297Z Entering 'third_party/cutlass' 2022-11-23T01:42:00.1797296Z Entering 'third_party/eigen' 2022-11-23T01:42:00.1848725Z Entering 'third_party/fbgemm' 2022-11-23T01:42:00.1898252Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:42:00.1947000Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:42:00.1996196Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:42:00.2043934Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:42:00.2094412Z Entering 'third_party/flatbuffers' 2022-11-23T01:42:00.2145886Z Entering 'third_party/fmt' 2022-11-23T01:42:00.2195448Z Entering 'third_party/foxi' 2022-11-23T01:42:00.2244125Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:42:00.2297871Z Entering 'third_party/gloo' 2022-11-23T01:42:00.2350476Z Entering 'third_party/googletest' 2022-11-23T01:42:00.2398366Z Entering 'third_party/ideep' 2022-11-23T01:42:00.2449036Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:42:00.2499954Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:42:00.2559326Z Entering 'third_party/ios-cmake' 2022-11-23T01:42:00.2609500Z Entering 'third_party/ittapi' 
2022-11-23T01:42:00.2659555Z Entering 'third_party/kineto' 2022-11-23T01:42:00.2712046Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:42:00.2762352Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:42:00.2813671Z Entering 'third_party/nccl/nccl' 2022-11-23T01:42:00.2865300Z Entering 'third_party/neon2sse' 2022-11-23T01:42:00.2916253Z Entering 'third_party/nlohmann' 2022-11-23T01:42:00.2973165Z Entering 'third_party/onnx' 2022-11-23T01:42:00.3035752Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:42:00.3088529Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:42:00.3144253Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:42:00.3195346Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:42:00.3252108Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:42:00.3300538Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:42:00.3351031Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:42:00.3407071Z Entering 'third_party/pocketfft' 2022-11-23T01:42:00.3458883Z Entering 'third_party/protobuf' 2022-11-23T01:42:00.3511311Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:42:00.3562942Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:42:00.3616928Z Entering 'third_party/psimd' 2022-11-23T01:42:00.3668697Z Entering 'third_party/pthreadpool' 2022-11-23T01:42:00.3716734Z Entering 'third_party/pybind11' 2022-11-23T01:42:00.3767888Z Entering 'third_party/python-enum' 2022-11-23T01:42:00.3817987Z Entering 'third_party/python-peachpy' 2022-11-23T01:42:00.3865097Z Entering 'third_party/python-six' 2022-11-23T01:42:00.3915977Z Entering 'third_party/sleef' 2022-11-23T01:42:00.3965011Z Entering 'third_party/tbb' 2022-11-23T01:42:00.4016925Z Entering 'third_party/tensorpipe' 2022-11-23T01:42:00.4066805Z Entering 
'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:42:00.4117563Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:42:00.4168163Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:42:00.4221078Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:42:00.4271964Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:42:00.4328682Z Entering 'third_party/zstd' 2022-11-23T01:42:00.4393348Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2022-11-23T01:42:00.4769434Z Entering 'android/libs/fbjni' 2022-11-23T01:42:00.4821919Z Entering 'third_party/FP16' 2022-11-23T01:42:00.4874077Z Entering 'third_party/FXdiv' 2022-11-23T01:42:00.4922043Z Entering 'third_party/NNPACK' 2022-11-23T01:42:00.4974142Z Entering 'third_party/QNNPACK' 2022-11-23T01:42:00.5024134Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:42:00.5075806Z Entering 'third_party/XNNPACK' 2022-11-23T01:42:00.5137563Z Entering 'third_party/benchmark' 2022-11-23T01:42:00.5187821Z Entering 'third_party/cpuinfo' 2022-11-23T01:42:00.5242748Z Entering 'third_party/cub' 2022-11-23T01:42:00.5293385Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:42:00.5349289Z Entering 'third_party/cutlass' 2022-11-23T01:42:00.5408868Z Entering 'third_party/eigen' 2022-11-23T01:42:00.5462383Z Entering 'third_party/fbgemm' 2022-11-23T01:42:00.5516693Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:42:00.5566957Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:42:00.5615005Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:42:00.5666133Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:42:00.5719752Z Entering 'third_party/flatbuffers' 2022-11-23T01:42:00.5771510Z Entering 'third_party/fmt' 2022-11-23T01:42:00.5818792Z Entering 'third_party/foxi' 2022-11-23T01:42:00.5871789Z 
Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:42:00.5920281Z Entering 'third_party/gloo' 2022-11-23T01:42:00.5970425Z Entering 'third_party/googletest' 2022-11-23T01:42:00.6018887Z Entering 'third_party/ideep' 2022-11-23T01:42:00.6068139Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:42:00.6118750Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:42:00.6177512Z Entering 'third_party/ios-cmake' 2022-11-23T01:42:00.6229283Z Entering 'third_party/ittapi' 2022-11-23T01:42:00.6278100Z Entering 'third_party/kineto' 2022-11-23T01:42:00.6329395Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:42:00.6377827Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:42:00.6429799Z Entering 'third_party/nccl/nccl' 2022-11-23T01:42:00.6479025Z Entering 'third_party/neon2sse' 2022-11-23T01:42:00.6532223Z Entering 'third_party/nlohmann' 2022-11-23T01:42:00.6584311Z Entering 'third_party/onnx' 2022-11-23T01:42:00.6648680Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:42:00.6701445Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:42:00.6757126Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:42:00.6805066Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:42:00.6862555Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:42:00.6914280Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:42:00.6963862Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:42:00.7024910Z Entering 'third_party/pocketfft' 2022-11-23T01:42:00.7078576Z Entering 'third_party/protobuf' 2022-11-23T01:42:00.7134554Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:42:00.7182951Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:42:00.7237391Z Entering 'third_party/psimd' 2022-11-23T01:42:00.7288605Z Entering 
'third_party/pthreadpool' 2022-11-23T01:42:00.7338117Z Entering 'third_party/pybind11' 2022-11-23T01:42:00.7388687Z Entering 'third_party/python-enum' 2022-11-23T01:42:00.7438620Z Entering 'third_party/python-peachpy' 2022-11-23T01:42:00.7489454Z Entering 'third_party/python-six' 2022-11-23T01:42:00.7538474Z Entering 'third_party/sleef' 2022-11-23T01:42:00.7588499Z Entering 'third_party/tbb' 2022-11-23T01:42:00.7639077Z Entering 'third_party/tensorpipe' 2022-11-23T01:42:00.7690293Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:42:00.7739355Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:42:00.7789331Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:42:00.7838462Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:42:00.7886151Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:42:00.7941243Z Entering 'third_party/zstd' 2022-11-23T01:42:00.8013258Z ##[endgroup] 2022-11-23T01:42:00.8066073Z [command]/usr/bin/git log -1 --format='%H' 2022-11-23T01:42:00.8100334Z '1cfd3858ac54fe3883534309081631a0a892ba3f' 2022-11-23T01:42:00.8273326Z Prepare all required actions 2022-11-23T01:42:00.8304338Z ##[group]Run ./.github/actions/setup-linux 2022-11-23T01:42:00.8304618Z env: 2022-11-23T01:42:00.8304867Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:42:00.8305132Z ##[endgroup] 2022-11-23T01:42:00.8322291Z ##[group]Run set -euo pipefail 2022-11-23T01:42:00.8322781Z set -euo pipefail 2022-11-23T01:42:00.8323072Z function get_ec2_metadata() { 2022-11-23T01:42:00.8323411Z  # Pulled from instance metadata endpoint for EC2 2022-11-23T01:42:00.8323949Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2022-11-23T01:42:00.8324354Z  category=$1 2022-11-23T01:42:00.8324690Z  curl -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2022-11-23T01:42:00.8325146Z } 2022-11-23T01:42:00.8325411Z echo "ami-id: $(get_ec2_metadata ami-id)" 
2022-11-23T01:42:00.8325797Z echo "instance-id: $(get_ec2_metadata instance-id)" 2022-11-23T01:42:00.8326178Z echo "instance-type: $(get_ec2_metadata instance-type)" 2022-11-23T01:42:00.8326498Z echo "system info $(uname -a)" 2022-11-23T01:42:00.8340153Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:42:00.8340450Z env: 2022-11-23T01:42:00.8340672Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:42:00.8341088Z ##[endgroup] 2022-11-23T01:42:00.8457514Z ami-id: ami-096198a0bccc6bad4 2022-11-23T01:42:00.8529616Z instance-id: i-08a957f819e89e94d 2022-11-23T01:42:00.8598474Z instance-type: g3.16xlarge 2022-11-23T01:42:00.8608685Z system info Linux ip-10-0-8-67.ec2.internal 4.14.252-195.483.amzn2.x86_64 #1 SMP Mon Nov 1 20:58:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux 2022-11-23T01:42:00.8631663Z ##[group]Run if systemctl is-active --quiet docker; then 2022-11-23T01:42:00.8632048Z if systemctl is-active --quiet docker; then 2022-11-23T01:42:00.8632367Z  echo "Docker daemon is running..."; 2022-11-23T01:42:00.8632659Z else 2022-11-23T01:42:00.8632977Z  echo "Starting docker deamon..." && sudo systemctl start docker; 2022-11-23T01:42:00.8633287Z fi 2022-11-23T01:42:00.8646904Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:42:00.8647213Z env: 2022-11-23T01:42:00.8647579Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:42:00.8647823Z ##[endgroup] 2022-11-23T01:42:00.8705087Z Docker daemon is running... 
2022-11-23T01:42:00.8723712Z ##[group]Run AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-11-23T01:42:00.8724193Z AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-11-23T01:42:00.8724595Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:42:00.8725461Z retry aws ecr get-login*** "$AWS_DEFAULT_REGION" | docker login --username AWS \ 2022-11-23T01:42:00.8725948Z  --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" 2022-11-23T01:42:00.8737582Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:42:00.8737889Z env: 2022-11-23T01:42:00.8738135Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:42:00.8738382Z AWS_RETRY_MODE: standard 2022-11-23T01:42:00.8738653Z AWS_MAX_ATTEMPTS: 5 2022-11-23T01:42:00.8738934Z AWS_DEFAULT_REGION: us-east-1 2022-11-23T01:42:00.8739178Z ##[endgroup] 2022-11-23T01:42:01.8508172Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2022-11-23T01:42:01.8508591Z Configure a credential helper to remove this warning. 
See 2022-11-23T01:42:01.8509137Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2022-11-23T01:42:01.8509413Z 2022-11-23T01:42:01.8509983Z Login Succeeded 2022-11-23T01:42:01.8581532Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:42:01.8581950Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:42:01.8582427Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:42:01.8596446Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:42:01.8596737Z env: 2022-11-23T01:42:01.8597162Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:42:01.8597410Z ##[endgroup] 2022-11-23T01:42:01.8699729Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2022-11-23T01:42:01.8700082Z with: 2022-11-23T01:42:01.8700579Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:42:01.8701054Z env: 2022-11-23T01:42:01.8701308Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:42:01.8701691Z ##[endgroup] 2022-11-23T01:42:01.8717691Z ##[group]Run retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:42:01.8718214Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:42:01.8718586Z # ignore output since only exit code is used for conditional 2022-11-23T01:42:01.8719076Z # only pull docker image if it's not available locally 2022-11-23T01:42:01.8719472Z if ! 
docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2022-11-23T01:42:01.8719788Z  retry docker pull "${DOCKER_IMAGE}" 2022-11-23T01:42:01.8720075Z fi 2022-11-23T01:42:01.8734590Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:42:01.8734875Z env: 2022-11-23T01:42:01.8735129Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:42:01.8735654Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:42:01.8736139Z ##[endgroup] 2022-11-23T01:42:02.1392324Z 072aae4a77ed7d3a69ad5683420509c41301b940: Pulling from pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7 2022-11-23T01:42:02.1392786Z a404e5416296: Pulling fs layer 2022-11-23T01:42:02.1393169Z c58c079e9b17: Pulling fs layer 2022-11-23T01:42:02.1394057Z e5b80b8bbe91: Pulling fs layer 2022-11-23T01:42:02.1394405Z 888240790290: Pulling fs layer 2022-11-23T01:42:02.1394703Z 515fe5e34eb4: Pulling fs layer 2022-11-23T01:42:02.1395018Z 4e4521f12f5a: Pulling fs layer 2022-11-23T01:42:02.1395308Z f6e1a56cb32d: Pulling fs layer 2022-11-23T01:42:02.1395538Z c29b96e36bd0: Pulling fs layer 2022-11-23T01:42:02.1395873Z 304d3c6c28d0: Pulling fs layer 2022-11-23T01:42:02.1396056Z fac00e927cfe: Pulling fs layer 2022-11-23T01:42:02.1396383Z f0158c8d8420: Pulling fs layer 2022-11-23T01:42:02.1396789Z 3ceac802dd07: Pulling fs layer 2022-11-23T01:42:02.1396997Z 0d0e625ba887: Pulling fs layer 2022-11-23T01:42:02.1397261Z bc2be817cb7e: Pulling fs layer 2022-11-23T01:42:02.1397811Z 11eb2106b948: Pulling fs layer 2022-11-23T01:42:02.1398055Z 888240790290: Waiting 2022-11-23T01:42:02.1398298Z f6e1a56cb32d: Waiting 2022-11-23T01:42:02.1398645Z c29b96e36bd0: Waiting 2022-11-23T01:42:02.1398918Z 34fa4193c7a6: Pulling fs layer 2022-11-23T01:42:02.1399091Z a7cf5b3894f8: Pulling fs layer 2022-11-23T01:42:02.1399380Z 3f6b06edd3f5: Pulling fs layer 2022-11-23T01:42:02.1399734Z 73a2b1f75a3d: Pulling 
fs layer 2022-11-23T01:42:02.1399899Z 304d3c6c28d0: Waiting 2022-11-23T01:42:02.1400161Z ba6235196410: Pulling fs layer 2022-11-23T01:42:02.1405536Z 879cdaf83543: Pulling fs layer 2022-11-23T01:42:02.1406017Z fac00e927cfe: Waiting 2022-11-23T01:42:02.1406534Z 3ceac802dd07: Waiting 2022-11-23T01:42:02.1406987Z 6ff0fc00b0a9: Pulling fs layer 2022-11-23T01:42:02.1407481Z f0158c8d8420: Waiting 2022-11-23T01:42:02.1408014Z a58b9ed071f4: Pulling fs layer 2022-11-23T01:42:02.1408412Z 73a2b1f75a3d: Waiting 2022-11-23T01:42:02.1408638Z 11eb2106b948: Waiting 2022-11-23T01:42:02.1408838Z a8c562f6a1cf: Pulling fs layer 2022-11-23T01:42:02.1409126Z 0d0e625ba887: Waiting 2022-11-23T01:42:02.1409351Z 34fa4193c7a6: Waiting 2022-11-23T01:42:02.1409593Z 3f6b06edd3f5: Waiting 2022-11-23T01:42:02.1409915Z 0a39b4492650: Pulling fs layer 2022-11-23T01:42:02.1410111Z 9088ff8de269: Pulling fs layer 2022-11-23T01:42:02.1410372Z 515fe5e34eb4: Waiting 2022-11-23T01:42:02.1410632Z 165006759af3: Pulling fs layer 2022-11-23T01:42:02.1410873Z a8c562f6a1cf: Waiting 2022-11-23T01:42:02.1411223Z 0a39b4492650: Waiting 2022-11-23T01:42:02.1411476Z ae48b7377a0d: Pulling fs layer 2022-11-23T01:42:02.1411938Z b18965f4b6f1: Pulling fs layer 2022-11-23T01:42:02.1412264Z ae48b7377a0d: Waiting 2022-11-23T01:42:02.1412467Z 102ddcd90753: Pulling fs layer 2022-11-23T01:42:02.1412726Z 5f5dd1cba120: Pulling fs layer 2022-11-23T01:42:02.1413012Z 8a7f50c8b503: Pulling fs layer 2022-11-23T01:42:02.1413332Z a7cf5b3894f8: Waiting 2022-11-23T01:42:02.1413517Z 863c35620b44: Pulling fs layer 2022-11-23T01:42:02.1413775Z 102ddcd90753: Waiting 2022-11-23T01:42:02.1414017Z ba6235196410: Waiting 2022-11-23T01:42:02.1414453Z 183e4209dc37: Pulling fs layer 2022-11-23T01:42:02.1414693Z a47cba6c334e: Pulling fs layer 2022-11-23T01:42:02.1414932Z 8a7f50c8b503: Waiting 2022-11-23T01:42:02.1415196Z a9f3d4742233: Pulling fs layer 2022-11-23T01:42:02.1415457Z 3cefa8a4607f: Pulling fs layer 2022-11-23T01:42:02.1415815Z 879cdaf83543: 
Waiting 2022-11-23T01:42:02.1415994Z 6ff0fc00b0a9: Waiting 2022-11-23T01:42:02.1416406Z 183e4209dc37: Waiting 2022-11-23T01:42:02.1416580Z a47cba6c334e: Waiting 2022-11-23T01:42:02.1416942Z 023a41fa48e6: Pulling fs layer 2022-11-23T01:42:02.1417096Z 3cefa8a4607f: Waiting 2022-11-23T01:42:02.1417352Z 96e251412f4d: Pulling fs layer 2022-11-23T01:42:02.1417626Z 49d40c00cf56: Pulling fs layer 2022-11-23T01:42:02.1417868Z bc2be817cb7e: Waiting 2022-11-23T01:42:02.1418193Z 7e2d6313145f: Pulling fs layer 2022-11-23T01:42:02.1418394Z 023a41fa48e6: Waiting 2022-11-23T01:42:02.1418626Z 96805775a692: Pulling fs layer 2022-11-23T01:42:02.1418895Z 75f1ead35ace: Pulling fs layer 2022-11-23T01:42:02.1419247Z 793c37004dab: Pulling fs layer 2022-11-23T01:42:02.1419450Z cadc5661750d: Pulling fs layer 2022-11-23T01:42:02.1419689Z 49d40c00cf56: Waiting 2022-11-23T01:42:02.1419933Z 7e2d6313145f: Waiting 2022-11-23T01:42:02.1420194Z 6386b2adbe28: Pulling fs layer 2022-11-23T01:42:02.1420497Z b18965f4b6f1: Waiting 2022-11-23T01:42:02.1420688Z 74aa250bc82f: Pulling fs layer 2022-11-23T01:42:02.1420940Z 96805775a692: Waiting 2022-11-23T01:42:02.1421171Z 436525efe61d: Pulling fs layer 2022-11-23T01:42:02.1421552Z 596be1fe0bda: Pulling fs layer 2022-11-23T01:42:02.1421736Z 772fa4efddc3: Pulling fs layer 2022-11-23T01:42:02.1422049Z cadc5661750d: Waiting 2022-11-23T01:42:02.1422239Z 91ddf385377b: Pulling fs layer 2022-11-23T01:42:02.1422516Z 75f1ead35ace: Waiting 2022-11-23T01:42:02.1422796Z 9f7cfb895784: Pulling fs layer 2022-11-23T01:42:02.1423088Z 8b8218af0479: Pulling fs layer 2022-11-23T01:42:02.1423272Z 772fa4efddc3: Waiting 2022-11-23T01:42:02.1423495Z 91ddf385377b: Waiting 2022-11-23T01:42:02.1423745Z 6386b2adbe28: Waiting 2022-11-23T01:42:02.1424450Z 5f5dd1cba120: Waiting 2022-11-23T01:42:02.1424675Z 74aa250bc82f: Waiting 2022-11-23T01:42:02.1424916Z 793c37004dab: Waiting 2022-11-23T01:42:02.1425161Z 96e251412f4d: Waiting 2022-11-23T01:42:02.1425397Z 596be1fe0bda: Waiting 
2022-11-23T01:42:02.1425644Z 8b8218af0479: Waiting 2022-11-23T01:42:02.2781190Z c58c079e9b17: Verifying Checksum 2022-11-23T01:42:02.2781651Z c58c079e9b17: Download complete 2022-11-23T01:42:02.3699862Z 888240790290: Verifying Checksum 2022-11-23T01:42:02.3700183Z 888240790290: Download complete 2022-11-23T01:42:02.4497399Z e5b80b8bbe91: Verifying Checksum 2022-11-23T01:42:02.4497783Z e5b80b8bbe91: Download complete 2022-11-23T01:42:02.4564691Z 515fe5e34eb4: Download complete 2022-11-23T01:42:02.5381525Z a404e5416296: Verifying Checksum 2022-11-23T01:42:02.5382244Z a404e5416296: Download complete 2022-11-23T01:42:02.5411278Z f6e1a56cb32d: Download complete 2022-11-23T01:42:02.6572415Z 304d3c6c28d0: Verifying Checksum 2022-11-23T01:42:02.6572796Z 304d3c6c28d0: Download complete 2022-11-23T01:42:02.7588681Z fac00e927cfe: Verifying Checksum 2022-11-23T01:42:03.2846309Z fac00e927cfe: Download complete 2022-11-23T01:42:03.2846575Z a404e5416296: Pull complete 2022-11-23T01:42:03.5782039Z c58c079e9b17: Pull complete 2022-11-23T01:42:04.0824555Z e5b80b8bbe91: Pull complete 2022-11-23T01:42:04.2037330Z 888240790290: Pull complete 2022-11-23T01:42:04.3259749Z 515fe5e34eb4: Pull complete 2022-11-23T01:42:04.8112406Z f0158c8d8420: Verifying Checksum 2022-11-23T01:42:04.8113105Z f0158c8d8420: Download complete 2022-11-23T01:42:04.8989839Z 3ceac802dd07: Download complete 2022-11-23T01:42:04.9963737Z 0d0e625ba887: Verifying Checksum 2022-11-23T01:42:04.9964091Z 0d0e625ba887: Download complete 2022-11-23T01:42:05.1008347Z bc2be817cb7e: Verifying Checksum 2022-11-23T01:42:05.1008701Z bc2be817cb7e: Download complete 2022-11-23T01:42:05.8283217Z 11eb2106b948: Verifying Checksum 2022-11-23T01:42:05.8283717Z 11eb2106b948: Download complete 2022-11-23T01:42:05.9103141Z 34fa4193c7a6: Verifying Checksum 2022-11-23T01:42:05.9103478Z 34fa4193c7a6: Download complete 2022-11-23T01:42:05.9900344Z a7cf5b3894f8: Verifying Checksum 2022-11-23T01:42:05.9900727Z a7cf5b3894f8: Download complete 
2022-11-23T01:42:13.6711185Z 4e4521f12f5a: Verifying Checksum 2022-11-23T01:42:13.6711567Z 4e4521f12f5a: Download complete 2022-11-23T01:42:13.7504156Z 73a2b1f75a3d: Download complete 2022-11-23T01:42:13.8328071Z ba6235196410: Verifying Checksum 2022-11-23T01:42:13.8328415Z ba6235196410: Download complete 2022-11-23T01:42:13.9050103Z 879cdaf83543: Download complete 2022-11-23T01:42:13.9708913Z 6ff0fc00b0a9: Verifying Checksum 2022-11-23T01:42:13.9709477Z 6ff0fc00b0a9: Download complete 2022-11-23T01:42:14.0826225Z a58b9ed071f4: Verifying Checksum 2022-11-23T01:42:14.0826637Z a58b9ed071f4: Download complete 2022-11-23T01:42:14.1540759Z a8c562f6a1cf: Verifying Checksum 2022-11-23T01:42:14.1541019Z a8c562f6a1cf: Download complete 2022-11-23T01:42:15.0970005Z 0a39b4492650: Verifying Checksum 2022-11-23T01:42:15.0970406Z 0a39b4492650: Download complete 2022-11-23T01:42:15.1784551Z 9088ff8de269: Verifying Checksum 2022-11-23T01:42:15.1784869Z 9088ff8de269: Download complete 2022-11-23T01:42:15.2765576Z 165006759af3: Verifying Checksum 2022-11-23T01:42:15.2765928Z 165006759af3: Download complete 2022-11-23T01:42:15.3705068Z ae48b7377a0d: Verifying Checksum 2022-11-23T01:42:15.3705520Z ae48b7377a0d: Download complete 2022-11-23T01:42:15.4377215Z b18965f4b6f1: Verifying Checksum 2022-11-23T01:42:15.4377571Z b18965f4b6f1: Download complete 2022-11-23T01:42:15.5059558Z 102ddcd90753: Download complete 2022-11-23T01:42:16.9116486Z c29b96e36bd0: Verifying Checksum 2022-11-23T01:42:16.9116807Z c29b96e36bd0: Download complete 2022-11-23T01:42:16.9777481Z 8a7f50c8b503: Verifying Checksum 2022-11-23T01:42:16.9778020Z 8a7f50c8b503: Download complete 2022-11-23T01:42:17.0615064Z 863c35620b44: Verifying Checksum 2022-11-23T01:42:17.0615713Z 863c35620b44: Download complete 2022-11-23T01:42:17.4566198Z 183e4209dc37: Verifying Checksum 2022-11-23T01:42:17.4566545Z 183e4209dc37: Download complete 2022-11-23T01:42:17.4923950Z 5f5dd1cba120: Verifying Checksum 2022-11-23T01:42:17.4924330Z 
5f5dd1cba120: Download complete 2022-11-23T01:42:17.5747556Z a9f3d4742233: Verifying Checksum 2022-11-23T01:42:17.5747977Z a9f3d4742233: Download complete 2022-11-23T01:42:17.6389425Z a47cba6c334e: Verifying Checksum 2022-11-23T01:42:17.6389728Z a47cba6c334e: Download complete 2022-11-23T01:42:17.7222416Z 023a41fa48e6: Verifying Checksum 2022-11-23T01:42:17.7222755Z 023a41fa48e6: Download complete 2022-11-23T01:42:17.8192419Z 3cefa8a4607f: Verifying Checksum 2022-11-23T01:42:17.8192779Z 3cefa8a4607f: Download complete 2022-11-23T01:42:17.8878454Z 49d40c00cf56: Verifying Checksum 2022-11-23T01:42:17.8878780Z 49d40c00cf56: Download complete 2022-11-23T01:42:17.9736513Z 7e2d6313145f: Verifying Checksum 2022-11-23T01:42:17.9736802Z 7e2d6313145f: Download complete 2022-11-23T01:42:18.1763777Z 96e251412f4d: Verifying Checksum 2022-11-23T01:42:18.1764174Z 96e251412f4d: Download complete 2022-11-23T01:42:18.2604785Z 75f1ead35ace: Verifying Checksum 2022-11-23T01:42:18.2605108Z 75f1ead35ace: Download complete 2022-11-23T01:42:18.3409049Z 793c37004dab: Download complete 2022-11-23T01:42:18.4183551Z cadc5661750d: Download complete 2022-11-23T01:42:18.4851625Z 6386b2adbe28: Verifying Checksum 2022-11-23T01:42:18.4851999Z 6386b2adbe28: Download complete 2022-11-23T01:42:18.6774171Z 74aa250bc82f: Verifying Checksum 2022-11-23T01:42:18.6774515Z 74aa250bc82f: Download complete 2022-11-23T01:42:18.7548994Z 436525efe61d: Download complete 2022-11-23T01:42:19.3620362Z 596be1fe0bda: Verifying Checksum 2022-11-23T01:42:19.3620678Z 596be1fe0bda: Download complete 2022-11-23T01:42:19.4511417Z 772fa4efddc3: Verifying Checksum 2022-11-23T01:42:19.4511710Z 772fa4efddc3: Download complete 2022-11-23T01:42:22.6293141Z 96805775a692: Verifying Checksum 2022-11-23T01:42:22.6293533Z 96805775a692: Download complete 2022-11-23T01:42:22.7068098Z 9f7cfb895784: Verifying Checksum 2022-11-23T01:42:22.7068672Z 9f7cfb895784: Download complete 2022-11-23T01:42:22.7870818Z 8b8218af0479: Verifying Checksum 
2022-11-23T01:42:22.7871143Z 8b8218af0479: Download complete 2022-11-23T01:42:25.8047846Z 3f6b06edd3f5: Download complete 2022-11-23T01:42:27.0737508Z 4e4521f12f5a: Pull complete 2022-11-23T01:42:27.2255482Z f6e1a56cb32d: Pull complete 2022-11-23T01:42:47.6769251Z c29b96e36bd0: Pull complete 2022-11-23T01:42:49.3964608Z 304d3c6c28d0: Pull complete 2022-11-23T01:42:49.4538563Z 91ddf385377b: Verifying Checksum 2022-11-23T01:42:49.4543038Z 91ddf385377b: Download complete 2022-11-23T01:42:51.2572288Z fac00e927cfe: Pull complete 2022-11-23T01:42:58.7794952Z f0158c8d8420: Pull complete 2022-11-23T01:43:00.6555132Z 3ceac802dd07: Pull complete 2022-11-23T01:43:02.5010195Z 0d0e625ba887: Pull complete 2022-11-23T01:43:04.3813053Z bc2be817cb7e: Pull complete 2022-11-23T01:43:08.5722434Z 11eb2106b948: Pull complete 2022-11-23T01:43:11.1604015Z 34fa4193c7a6: Pull complete 2022-11-23T01:43:13.0381994Z a7cf5b3894f8: Pull complete 2022-11-23T01:43:45.1217274Z 3f6b06edd3f5: Pull complete 2022-11-23T01:43:46.9937058Z 73a2b1f75a3d: Pull complete 2022-11-23T01:43:48.8694303Z ba6235196410: Pull complete 2022-11-23T01:43:50.4310024Z 879cdaf83543: Pull complete 2022-11-23T01:43:52.1478272Z 6ff0fc00b0a9: Pull complete 2022-11-23T01:43:53.2016235Z a58b9ed071f4: Pull complete 2022-11-23T01:43:53.2993730Z a8c562f6a1cf: Pull complete 2022-11-23T01:43:55.3435622Z 0a39b4492650: Pull complete 2022-11-23T01:43:55.4368906Z 9088ff8de269: Pull complete 2022-11-23T01:43:55.5344113Z 165006759af3: Pull complete 2022-11-23T01:43:55.6833728Z ae48b7377a0d: Pull complete 2022-11-23T01:43:55.7832367Z b18965f4b6f1: Pull complete 2022-11-23T01:43:55.9099416Z 102ddcd90753: Pull complete 2022-11-23T01:44:03.4722090Z 5f5dd1cba120: Pull complete 2022-11-23T01:44:05.3169551Z 8a7f50c8b503: Pull complete 2022-11-23T01:44:07.1917996Z 863c35620b44: Pull complete 2022-11-23T01:44:09.9247335Z 183e4209dc37: Pull complete 2022-11-23T01:44:11.8176852Z a47cba6c334e: Pull complete 2022-11-23T01:44:14.8556508Z a9f3d4742233: 
Pull complete 2022-11-23T01:44:18.3957032Z 3cefa8a4607f: Pull complete 2022-11-23T01:44:20.6844347Z 023a41fa48e6: Pull complete 2022-11-23T01:44:24.6682179Z 96e251412f4d: Pull complete 2022-11-23T01:44:27.1343343Z 49d40c00cf56: Pull complete 2022-11-23T01:44:29.0767781Z 7e2d6313145f: Pull complete 2022-11-23T01:44:37.6042908Z 96805775a692: Pull complete 2022-11-23T01:44:39.4529787Z 75f1ead35ace: Pull complete 2022-11-23T01:44:41.2900817Z 793c37004dab: Pull complete 2022-11-23T01:44:43.1337100Z cadc5661750d: Pull complete 2022-11-23T01:44:44.9797479Z 6386b2adbe28: Pull complete 2022-11-23T01:44:48.3380927Z 74aa250bc82f: Pull complete 2022-11-23T01:44:50.1845744Z 436525efe61d: Pull complete 2022-11-23T01:44:52.3752513Z 596be1fe0bda: Pull complete 2022-11-23T01:44:52.4777283Z 772fa4efddc3: Pull complete 2022-11-23T01:45:32.4513455Z 91ddf385377b: Pull complete 2022-11-23T01:45:34.3304482Z 9f7cfb895784: Pull complete 2022-11-23T01:45:36.1784401Z 8b8218af0479: Pull complete 2022-11-23T01:45:37.4317352Z Digest: sha256:3a5626edfb2c43fb24303351be75287af92426b6bb7c6df2defc98f980346c6a 2022-11-23T01:45:37.9328971Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:45:38.2153155Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:45:38.2267907Z ##[group]Run pytorch/test-infra/.github/actions/setup-nvidia@main 2022-11-23T01:45:38.2268252Z with: 2022-11-23T01:45:38.2268514Z driver-version: 515.76 2022-11-23T01:45:38.2268773Z env: 2022-11-23T01:45:38.2269002Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:45:38.2269276Z ##[endgroup] 2022-11-23T01:45:38.3723933Z ##[group]Run nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767 2022-11-23T01:45:38.3724256Z with: 2022-11-23T01:45:38.3724476Z timeout_minutes: 10 2022-11-23T01:45:38.3724918Z 
max_attempts: 3 2022-11-23T01:45:38.3730874Z command: # Is it disgusting to have a full shell script here in this github action? Sure # But is it the best way to make it so that this action relies on nothing else? Absolutely set -eou pipefail DISTRIBUTION=$(. /etc/os-release;echo $ID$VERSION_ID) DRIVER_FN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run" YUM_REPO_URL="https://nvidia.github.io/nvidia-docker/${DISTRIBUTION}/nvidia-docker.repo" install_nvidia_docker2_amzn2() { ( set -x # Needed for yum-config-manager sudo yum install -y yum-utils sudo yum-config-manager --add-repo "${YUM_REPO_URL}" sudo yum install -y nvidia-docker2 sudo systemctl restart docker ) } install_nvidia_driver_amzn2() { ( set -x # Purge any nvidia driver installed from RHEL repo sudo yum remove -y nvidia-driver-latest-dkms # Try to gather more information about the runner and its existing NVIDIA driver if any echo "Before installing NVIDIA driver" lspci lsmod modinfo nvidia || true HAS_NVIDIA_DRIVER=0 # Check if NVIDIA driver has already been installed if [ -x "$(command -v nvidia-smi)" ]; then set +e # The driver exists, check its version next. Also check only the first GPU if there are more than one of them # so that the same driver version is not print over multiple lines INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then echo "Failed to get NVIDIA driver version ($INSTALLED_DRIVER_VERSION). Continuing" elif [ "$INSTALLED_DRIVER_VERSION" != "$DRIVER_VERSION" ]; then echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has been installed, but we expect to have $DRIVER_VERSION instead. Continuing" else HAS_NVIDIA_DRIVER=1 echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has already been installed. 
Skipping NVIDIA driver installation" fi set -e fi if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then sudo yum groupinstall -y "Development Tools" # ensure our kernel install is the same as our underlying kernel, # groupinstall "Development Tools" has a habit of mismatching kernel headers sudo yum install -y "kernel-devel-uname-r == $(uname -r)" sudo modprobe backlight sudo curl -fsL -o /tmp/nvidia_driver "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN" set +e sudo /bin/bash /tmp/nvidia_driver -s --no-drm NVIDIA_INSTALLATION_STATUS=$? RESET_GPU=0 if [ "$NVIDIA_INSTALLATION_STATUS" -ne 0 ]; then sudo cat /var/log/nvidia-installer.log # Fail to install NVIDIA driver, try to reset the GPU RESET_GPU=1 elif [ -x "$(command -v nvidia-smi)" ]; then # Check again if nvidia-smi works even if the driver installation completes successfully INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then RESET_GPU=1 fi fi if [ "$RESET_GPU" -eq 1 ]; then NVIDIA_DEVICES=$(lspci -D | grep -i NVIDIA | cut -d' ' -f1) # The GPU can get stuck in a failure state if somehow the test crashs the GPU microcode. When this # happens, we'll try to reset all NVIDIA devices https://github.com/pytorch/pytorch/issues/88388 for PCI_ID in $NVIDIA_DEVICES; do DEVICE_ENABLED=$(cat /sys/bus/pci/devices/$PCI_ID/enable) echo "Reseting $PCI_ID (enabled state: $DEVICE_ENABLED)" # This requires sudo permission of course echo "1" | sudo tee /sys/bus/pci/devices/$PCI_ID/reset sleep 1 done fi sudo rm -fv /tmp/nvidia_driver set -e fi sudo modprobe nvidia || true echo "After installing NVIDIA driver" lspci lsmod modinfo nvidia || true ( set +e nvidia-smi NVIDIA_SMI_STATUS=$? 
# Allowable exit statuses for nvidia-smi, see: https://github.com/NVIDIA/gpu-operator/issues/285 if [ "$NVIDIA_SMI_STATUS" -eq 0 ] || [ "$NVIDIA_SMI_STATUS" -eq 14 ]; then echo "INFO: Ignoring allowed status ${NVIDIA_SMI_STATUS}" else echo "ERROR: nvidia-smi exited with unresolved status ${NVIDIA_SMI_STATUS}" exit ${NVIDIA_SMI_STATUS} fi set -e ) ) } echo "== Installing nvidia driver ${DRIVER_FN} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_driver_amzn2 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Install container toolkit based on distribution echo "== Installing nvidia container toolkit for ${DISTRIBUTION} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_docker2_amzn2 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac echo "GPU_FLAG=--gpus all" >> "${GITHUB_ENV}" 2022-11-23T01:45:38.3737464Z retry_wait_seconds: 10 2022-11-23T01:45:38.3737839Z polling_interval_seconds: 1 2022-11-23T01:45:38.3738037Z warning_on_retry: true 2022-11-23T01:45:38.3738307Z continue_on_error: false 2022-11-23T01:45:38.3738531Z env: 2022-11-23T01:45:38.3738772Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:45:38.3739040Z DRIVER_VERSION: 515.76 2022-11-23T01:45:38.3739268Z ##[endgroup] 2022-11-23T01:45:38.4342636Z 2022-11-23T01:45:38.4367274Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. 
For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:45:38.4416872Z == Installing nvidia driver NVIDIA-Linux-x86_64-515.76.run == 2022-11-23T01:45:38.4418019Z + sudo yum remove -y nvidia-driver-latest-dkms 2022-11-23T01:45:38.9279512Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:45:38.9728306Z No Match for argument: nvidia-driver-latest-dkms 2022-11-23T01:45:39.0026512Z No Packages marked for removal 2022-11-23T01:45:39.0196179Z + echo 'Before installing NVIDIA driver' 2022-11-23T01:45:39.0196453Z + lspci 2022-11-23T01:45:39.0196763Z Before installing NVIDIA driver 2022-11-23T01:45:40.2607720Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02) 2022-11-23T01:45:40.2608251Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2022-11-23T01:45:40.2608835Z 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 2022-11-23T01:45:40.2609536Z 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01) 2022-11-23T01:45:40.2609920Z 00:02.0 VGA compatible controller: Cirrus Logic GD 5446 2022-11-23T01:45:40.2610309Z 00:03.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2022-11-23T01:45:40.2611001Z 00:1b.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:45:40.2611463Z 00:1c.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:45:40.2611894Z 00:1d.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:45:40.2612473Z 00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:45:40.2613065Z 00:1f.0 Unassigned class [ff80]: XenSource, Inc. 
Xen Platform Device (rev 01) 2022-11-23T01:45:40.2613386Z + lsmod 2022-11-23T01:45:40.2636064Z Module Size Used by 2022-11-23T01:45:40.2636435Z xt_conntrack 16384 1 2022-11-23T01:45:40.2637012Z ipt_MASQUERADE 16384 1 2022-11-23T01:45:40.2637342Z nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE 2022-11-23T01:45:40.2637637Z nf_conntrack_netlink 49152 0 2022-11-23T01:45:40.2637949Z nfnetlink 16384 2 nf_conntrack_netlink 2022-11-23T01:45:40.2638241Z xfrm_user 45056 1 2022-11-23T01:45:40.2638532Z xfrm_algo 16384 1 xfrm_user 2022-11-23T01:45:40.2638802Z xt_addrtype 16384 2 2022-11-23T01:45:40.2639062Z iptable_filter 16384 1 2022-11-23T01:45:40.2639331Z iptable_nat 16384 1 2022-11-23T01:45:40.2639583Z nf_conntrack_ipv4 16384 3 2022-11-23T01:45:40.2639888Z nf_defrag_ipv4 16384 1 nf_conntrack_ipv4 2022-11-23T01:45:40.2640193Z nf_nat_ipv4 16384 1 iptable_nat 2022-11-23T01:45:40.2640494Z nf_nat 36864 2 nf_nat_masquerade_ipv4,nf_nat_ipv4 2022-11-23T01:45:40.2640970Z nf_conntrack 155648 7 xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink 2022-11-23T01:45:40.2641378Z br_netfilter 24576 0 2022-11-23T01:45:40.2641666Z bridge 172032 1 br_netfilter 2022-11-23T01:45:40.2646643Z stp 16384 1 bridge 2022-11-23T01:45:40.2646935Z llc 16384 2 bridge,stp 2022-11-23T01:45:40.2647232Z overlay 86016 0 2022-11-23T01:45:40.2647498Z sunrpc 393216 1 2022-11-23T01:45:40.2647819Z dm_mirror 28672 0 2022-11-23T01:45:40.2648096Z dm_region_hash 20480 1 dm_mirror 2022-11-23T01:45:40.2648410Z dm_log 20480 2 dm_region_hash,dm_mirror 2022-11-23T01:45:40.2648706Z dm_mod 143360 2 dm_log,dm_mirror 2022-11-23T01:45:40.2648986Z dax 69632 1 dm_mod 2022-11-23T01:45:40.2649247Z sb_edac 24576 0 2022-11-23T01:45:40.2649488Z crc32_pclmul 16384 0 2022-11-23T01:45:40.2649769Z ghash_clmulni_intel 16384 0 2022-11-23T01:45:40.2650034Z pcbc 16384 0 2022-11-23T01:45:40.2650275Z ata_piix 36864 0 2022-11-23T01:45:40.2650531Z aesni_intel 188416 0 
2022-11-23T01:45:40.2650798Z aes_x86_64 20480 1 aesni_intel 2022-11-23T01:45:40.2651073Z libata 266240 1 ata_piix 2022-11-23T01:45:40.2651347Z crypto_simd 16384 1 aesni_intel 2022-11-23T01:45:40.2651644Z glue_helper 16384 1 aesni_intel 2022-11-23T01:45:40.2651981Z cryptd 28672 3 crypto_simd,ghash_clmulni_intel,aesni_intel 2022-11-23T01:45:40.2652282Z mousedev 24576 0 2022-11-23T01:45:40.2658000Z pcc_cpufreq 16384 0 2022-11-23T01:45:40.2658266Z scsi_mod 245760 1 libata 2022-11-23T01:45:40.2658547Z evdev 20480 3 2022-11-23T01:45:40.2658802Z psmouse 32768 0 2022-11-23T01:45:40.2659043Z button 16384 0 2022-11-23T01:45:40.2659302Z ena 114688 0 2022-11-23T01:45:40.2659560Z xen_blkfront 49152 2 2022-11-23T01:45:40.2659802Z crc32c_intel 24576 0 2022-11-23T01:45:40.2660052Z autofs4 49152 2 2022-11-23T01:45:40.2660297Z + modinfo nvidia 2022-11-23T01:45:40.2660556Z modinfo: ERROR: Module nvidia not found. 2022-11-23T01:45:40.2660821Z + true 2022-11-23T01:45:40.2661049Z + HAS_NVIDIA_DRIVER=0 2022-11-23T01:45:40.2661522Z ++ command -v nvidia-smi 2022-11-23T01:45:40.2661830Z + '[' -x '' ']' 2022-11-23T01:45:40.2662112Z + '[' 0 -eq 0 ']' 2022-11-23T01:45:40.2662446Z + sudo yum groupinstall -y 'Development Tools' 2022-11-23T01:45:40.7677748Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:45:41.0952299Z Resolving Dependencies 2022-11-23T01:45:41.0958784Z --> Running transaction check 2022-11-23T01:45:41.0960452Z ---> Package autoconf.noarch 0:2.69-11.amzn2 will be installed 2022-11-23T01:45:41.1185739Z --> Processing Dependency: m4 >= 1.4.14 for package: autoconf-2.69-11.amzn2.noarch 2022-11-23T01:45:41.3387198Z --> Processing Dependency: perl(Data::Dumper) for package: autoconf-2.69-11.amzn2.noarch 2022-11-23T01:45:41.3388000Z ---> Package automake.noarch 0:1.13.4-3.1.amzn2 will be installed 2022-11-23T01:45:41.3436608Z --> Processing Dependency: perl(Thread::Queue) for package: automake-1.13.4-3.1.amzn2.noarch 2022-11-23T01:45:41.3443382Z --> 
Processing Dependency: perl(TAP::Parser) for package: automake-1.13.4-3.1.amzn2.noarch 2022-11-23T01:45:41.3454244Z ---> Package bison.x86_64 0:3.0.4-6.amzn2.0.2 will be installed 2022-11-23T01:45:41.3574202Z ---> Package byacc.x86_64 0:1.9.20130304-3.amzn2.0.2 will be installed 2022-11-23T01:45:41.3581510Z ---> Package cscope.x86_64 0:15.8-10.amzn2.0.2 will be installed 2022-11-23T01:45:41.3627475Z --> Processing Dependency: emacs-filesystem for package: cscope-15.8-10.amzn2.0.2.x86_64 2022-11-23T01:45:41.3651741Z ---> Package ctags.x86_64 0:5.8-13.amzn2.0.2 will be installed 2022-11-23T01:45:41.3660994Z ---> Package diffstat.x86_64 0:1.57-4.amzn2.0.2 will be installed 2022-11-23T01:45:41.3670621Z ---> Package doxygen.x86_64 1:1.8.5-4.amzn2 will be installed 2022-11-23T01:45:41.3772328Z ---> Package elfutils.x86_64 0:0.176-2.amzn2 will be installed 2022-11-23T01:45:41.3910012Z ---> Package flex.x86_64 0:2.5.37-3.amzn2.0.3 will be installed 2022-11-23T01:45:41.3928414Z ---> Package gcc.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.4107789Z --> Processing Dependency: cpp = 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4127906Z --> Processing Dependency: libsanitizer >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4184152Z --> Processing Dependency: libquadmath >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4238272Z --> Processing Dependency: libmpx >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4296038Z --> Processing Dependency: libitm >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4351441Z --> Processing Dependency: libcilkrts >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4405501Z --> Processing Dependency: libatomic >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4462531Z --> Processing Dependency: glibc-devel >= 2.2.90-12 for package: 
gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4632492Z --> Processing Dependency: libmpfr.so.4()(64bit) for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4654368Z --> Processing Dependency: libmpc.so.3()(64bit) for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4677100Z ---> Package gcc-c++.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.4705775Z ---> Package gcc-gfortran.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.4739315Z --> Processing Dependency: libgfortran.so.4()(64bit) for package: gcc-gfortran-7.3.1-15.amzn2.x86_64 2022-11-23T01:45:41.4803160Z ---> Package indent.x86_64 0:2.2.11-13.amzn2.0.2 will be installed 2022-11-23T01:45:41.4819205Z ---> Package intltool.noarch 0:0.50.2-7.amzn2 will be installed 2022-11-23T01:45:41.4873554Z --> Processing Dependency: perl(XML::Parser) for package: intltool-0.50.2-7.amzn2.noarch 2022-11-23T01:45:41.4890397Z --> Processing Dependency: gettext-devel for package: intltool-0.50.2-7.amzn2.noarch 2022-11-23T01:45:41.4909560Z ---> Package libtool.x86_64 0:2.4.2-22.2.amzn2.0.2 will be installed 2022-11-23T01:45:41.4939663Z ---> Package patch.x86_64 0:2.7.1-12.amzn2.0.2 will be installed 2022-11-23T01:45:41.4976167Z ---> Package patchutils.x86_64 0:0.3.3-4.amzn2.0.1 will be installed 2022-11-23T01:45:41.5002788Z ---> Package rcs.x86_64 0:5.9.0-5.amzn2.0.2 will be installed 2022-11-23T01:45:41.5037043Z ---> Package rpm-build.x86_64 0:4.11.3-48.amzn2.0.2 will be installed 2022-11-23T01:45:41.5285302Z --> Processing Dependency: /usr/bin/gdb-add-index for package: rpm-build-4.11.3-48.amzn2.0.2.x86_64 2022-11-23T01:45:41.5303613Z ---> Package rpm-sign.x86_64 0:4.11.3-48.amzn2.0.2 will be installed 2022-11-23T01:45:41.5327699Z ---> Package subversion.x86_64 0:1.7.14-16.amzn2.0.1 will be installed 2022-11-23T01:45:41.5504028Z --> Processing Dependency: subversion-libs(x86-64) = 1.7.14-16.amzn2.0.1 for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5526149Z --> 
Processing Dependency: libsvn_wc-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5526840Z --> Processing Dependency: libsvn_subr-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5527462Z --> Processing Dependency: libsvn_repos-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5528084Z --> Processing Dependency: libsvn_ra_svn-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5528692Z --> Processing Dependency: libsvn_ra_neon-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5529349Z --> Processing Dependency: libsvn_ra_local-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5529982Z --> Processing Dependency: libsvn_ra-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5530597Z --> Processing Dependency: libsvn_fs_util-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5531215Z --> Processing Dependency: libsvn_fs_fs-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5531835Z --> Processing Dependency: libsvn_fs_base-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5532430Z --> Processing Dependency: libsvn_fs-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5532959Z --> Processing Dependency: libsvn_diff-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5533677Z --> Processing Dependency: libsvn_delta-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5534208Z --> Processing Dependency: libsvn_client-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5534801Z --> Processing Dependency: libneon.so.27()(64bit) for package: 
subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5553435Z --> Processing Dependency: libaprutil-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5575839Z --> Processing Dependency: libapr-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:45:41.5599107Z ---> Package swig.x86_64 0:3.0.12-11.amzn2.0.3 will be installed 2022-11-23T01:45:41.5621845Z ---> Package system-rpm-config.noarch 0:9.1.0-76.amzn2.0.14 will be installed 2022-11-23T01:45:41.5668737Z --> Processing Dependency: dwz >= 0.4 for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-11-23T01:45:41.5684454Z --> Processing Dependency: perl-srpm-macros for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-11-23T01:45:41.5696607Z --> Processing Dependency: go-srpm-macros for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-11-23T01:45:41.5878860Z ---> Package systemtap.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-11-23T01:45:41.5893226Z --> Processing Dependency: systemtap-devel = 4.5-1.amzn2.0.1 for package: systemtap-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:45:41.5910151Z --> Processing Dependency: systemtap-client = 4.5-1.amzn2.0.1 for package: systemtap-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:45:41.5923029Z --> Running transaction check 2022-11-23T01:45:41.5924448Z ---> Package apr.x86_64 0:1.7.0-9.amzn2 will be installed 2022-11-23T01:45:41.6009816Z ---> Package apr-util.x86_64 0:1.6.1-5.amzn2.0.2 will be installed 2022-11-23T01:45:41.6049303Z --> Processing Dependency: apr-util-bdb(x86-64) = 1.6.1-5.amzn2.0.2 for package: apr-util-1.6.1-5.amzn2.0.2.x86_64 2022-11-23T01:45:41.6063961Z ---> Package cpp.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.6140340Z ---> Package dwz.x86_64 0:0.11-3.amzn2.0.3 will be installed 2022-11-23T01:45:41.6151491Z ---> Package emacs-filesystem.noarch 1:27.2-4.amzn2.0.1 will be installed 2022-11-23T01:45:41.6151994Z ---> Package gdb.x86_64 0:8.0.1-36.amzn2.0.1 
will be installed 2022-11-23T01:45:41.6224563Z ---> Package gettext-devel.x86_64 0:0.19.8.1-3.amzn2 will be installed 2022-11-23T01:45:41.6284773Z --> Processing Dependency: gettext-common-devel = 0.19.8.1-3.amzn2 for package: gettext-devel-0.19.8.1-3.amzn2.x86_64 2022-11-23T01:45:41.6294065Z ---> Package glibc-devel.x86_64 0:2.26-62.amzn2 will be installed 2022-11-23T01:45:41.6418694Z --> Processing Dependency: glibc-headers = 2.26-62.amzn2 for package: glibc-devel-2.26-62.amzn2.x86_64 2022-11-23T01:45:41.6449645Z --> Processing Dependency: glibc-headers for package: glibc-devel-2.26-62.amzn2.x86_64 2022-11-23T01:45:41.6450219Z ---> Package go-srpm-macros.noarch 0:3.0.15-23.amzn2.0.2 will be installed 2022-11-23T01:45:41.6452699Z ---> Package libatomic.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.6468491Z ---> Package libcilkrts.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.6495665Z ---> Package libgfortran.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.6534098Z ---> Package libitm.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.6549319Z ---> Package libmpc.x86_64 0:1.0.1-3.amzn2.0.2 will be installed 2022-11-23T01:45:41.6561038Z ---> Package libmpx.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.6576073Z ---> Package libquadmath.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.6603251Z ---> Package libsanitizer.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:45:41.6648260Z ---> Package m4.x86_64 0:1.4.16-10.amzn2.0.2 will be installed 2022-11-23T01:45:41.6662700Z ---> Package mpfr.x86_64 0:3.1.1-4.amzn2.0.2 will be installed 2022-11-23T01:45:41.6684543Z ---> Package neon.x86_64 0:0.30.0-3.amzn2.0.2 will be installed 2022-11-23T01:45:41.6762909Z --> Processing Dependency: libgnutls.so.28(GNUTLS_2_12)(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:45:41.6802790Z --> Processing Dependency: libgnutls.so.28(GNUTLS_1_4)(64bit) for package: 
neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:45:41.6803416Z --> Processing Dependency: libproxy.so.1()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:45:41.6824609Z --> Processing Dependency: libpakchois.so.0()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:45:41.6840462Z --> Processing Dependency: libgnutls.so.28()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:45:41.6847162Z ---> Package perl-Data-Dumper.x86_64 0:2.145-3.amzn2.0.2 will be installed 2022-11-23T01:45:41.6895987Z ---> Package perl-Test-Harness.noarch 0:3.28-3.amzn2 will be installed 2022-11-23T01:45:41.6993665Z ---> Package perl-Thread-Queue.noarch 0:3.02-2.amzn2 will be installed 2022-11-23T01:45:41.7005284Z ---> Package perl-XML-Parser.x86_64 0:2.41-10.amzn2.0.2 will be installed 2022-11-23T01:45:41.7019943Z ---> Package perl-srpm-macros.noarch 0:1-8.amzn2.0.1 will be installed 2022-11-23T01:45:41.7020504Z ---> Package subversion-libs.x86_64 0:1.7.14-16.amzn2.0.1 will be installed 2022-11-23T01:45:41.7050441Z ---> Package systemtap-client.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-11-23T01:45:41.7262786Z --> Processing Dependency: mokutil for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:45:41.7276597Z --> Processing Dependency: libavahi-common.so.3()(64bit) for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:45:41.7303800Z --> Processing Dependency: libavahi-client.so.3()(64bit) for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:45:41.7304739Z ---> Package systemtap-devel.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-11-23T01:45:41.7424298Z --> Processing Dependency: kernel-devel-uname-r for package: systemtap-devel-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:45:41.8488517Z --> Running transaction check 2022-11-23T01:45:41.8489059Z ---> Package apr-util-bdb.x86_64 0:1.6.1-5.amzn2.0.2 will be installed 2022-11-23T01:45:41.8500041Z ---> Package avahi-libs.x86_64 0:0.6.31-20.amzn2 will be installed 
2022-11-23T01:45:41.8526287Z ---> Package gettext-common-devel.noarch 0:0.19.8.1-3.amzn2 will be installed 2022-11-23T01:45:41.8526850Z ---> Package glibc-headers.x86_64 0:2.26-62.amzn2 will be installed 2022-11-23T01:45:41.8604627Z --> Processing Dependency: kernel-headers >= 2.2.1 for package: glibc-headers-2.26-62.amzn2.x86_64 2022-11-23T01:45:41.9741646Z --> Processing Dependency: kernel-headers for package: glibc-headers-2.26-62.amzn2.x86_64 2022-11-23T01:45:41.9742184Z ---> Package gnutls.x86_64 0:3.3.29-9.amzn2.0.1 will be installed 2022-11-23T01:45:41.9808451Z --> Processing Dependency: trousers >= 0.3.11.2 for package: gnutls-3.3.29-9.amzn2.0.1.x86_64 2022-11-23T01:45:41.9837748Z ---> Package kernel-devel.x86_64 0:4.14.296-222.539.amzn2 will be installed 2022-11-23T01:45:41.9864423Z --> Processing Dependency: elfutils-libelf-devel for package: kernel-devel-4.14.296-222.539.amzn2.x86_64 2022-11-23T01:45:41.9886735Z ---> Package libproxy.x86_64 0:0.4.11-10.amzn2.0.3 will be installed 2022-11-23T01:45:41.9915537Z --> Processing Dependency: libmodman.so.1()(64bit) for package: libproxy-0.4.11-10.amzn2.0.3.x86_64 2022-11-23T01:45:41.9931825Z ---> Package mokutil.x86_64 1:0.3.0-10.amzn2.0.1 will be installed 2022-11-23T01:45:41.9982279Z --> Processing Dependency: libefivar.so.1(libefivar.so.0)(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-11-23T01:45:42.0003161Z --> Processing Dependency: libefivar.so.1(LIBEFIVAR_0.24)(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-11-23T01:45:42.0003850Z --> Processing Dependency: libefivar.so.1()(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-11-23T01:45:42.0004297Z ---> Package pakchois.x86_64 0:0.4-10.amzn2.0.2 will be installed 2022-11-23T01:45:42.0018974Z --> Running transaction check 2022-11-23T01:45:42.0019481Z ---> Package efivar-libs.x86_64 0:31-4.amzn2.0.4 will be installed 2022-11-23T01:45:42.0038042Z ---> Package elfutils-libelf-devel.x86_64 0:0.176-2.amzn2 will be 
installed 2022-11-23T01:45:42.0049806Z --> Processing Dependency: pkgconfig(zlib) for package: elfutils-libelf-devel-0.176-2.amzn2.x86_64 2022-11-23T01:45:42.0078100Z ---> Package kernel-headers.x86_64 0:4.14.296-222.539.amzn2 will be installed 2022-11-23T01:45:42.0078610Z ---> Package libmodman.x86_64 0:2.0.1-8.amzn2.0.2 will be installed 2022-11-23T01:45:42.0098416Z ---> Package trousers.x86_64 0:0.3.14-2.amzn2.0.2 will be installed 2022-11-23T01:45:42.0156332Z --> Running transaction check 2022-11-23T01:45:42.0156860Z ---> Package zlib-devel.x86_64 0:1.2.7-19.amzn2.0.2 will be installed 2022-11-23T01:45:42.2855607Z --> Finished Dependency Resolution 2022-11-23T01:45:42.3642483Z 2022-11-23T01:45:42.3642805Z Dependencies Resolved 2022-11-23T01:45:42.3761790Z 2022-11-23T01:45:42.3762124Z ================================================================================ 2022-11-23T01:45:42.3762639Z Package Arch Version Repository Size 2022-11-23T01:45:42.3763311Z ================================================================================ 2022-11-23T01:45:42.3763657Z Installing for group install "Development Tools": 2022-11-23T01:45:42.3764192Z autoconf noarch 2.69-11.amzn2 amzn2-core 701 k 2022-11-23T01:45:42.3764747Z automake noarch 1.13.4-3.1.amzn2 amzn2-core 679 k 2022-11-23T01:45:42.3765181Z bison x86_64 3.0.4-6.amzn2.0.2 amzn2-core 674 k 2022-11-23T01:45:42.3765775Z byacc x86_64 1.9.20130304-3.amzn2.0.2 amzn2-core 66 k 2022-11-23T01:45:42.3766258Z cscope x86_64 15.8-10.amzn2.0.2 amzn2-core 204 k 2022-11-23T01:45:42.3766682Z ctags x86_64 5.8-13.amzn2.0.2 amzn2-core 157 k 2022-11-23T01:45:42.3767024Z diffstat x86_64 1.57-4.amzn2.0.2 amzn2-core 35 k 2022-11-23T01:45:42.3767466Z doxygen x86_64 1:1.8.5-4.amzn2 amzn2-core 3.5 M 2022-11-23T01:45:42.3768067Z elfutils x86_64 0.176-2.amzn2 amzn2-core 307 k 2022-11-23T01:45:42.3768402Z flex x86_64 2.5.37-3.amzn2.0.3 amzn2-core 291 k 2022-11-23T01:45:42.3768845Z gcc x86_64 7.3.1-15.amzn2 amzn2-core 22 M 
2022-11-23T01:45:42.3769285Z gcc-c++ x86_64 7.3.1-15.amzn2 amzn2-core 13 M 2022-11-23T01:45:42.3769713Z gcc-gfortran x86_64 7.3.1-15.amzn2 amzn2-core 11 M 2022-11-23T01:45:42.3770169Z indent x86_64 2.2.11-13.amzn2.0.2 amzn2-core 150 k 2022-11-23T01:45:42.3770608Z intltool noarch 0.50.2-7.amzn2 amzn2-core 59 k 2022-11-23T01:45:42.3771203Z libtool x86_64 2.4.2-22.2.amzn2.0.2 amzn2-core 588 k 2022-11-23T01:45:42.3771609Z patch x86_64 2.7.1-12.amzn2.0.2 amzn2-core 110 k 2022-11-23T01:45:42.3772045Z patchutils x86_64 0.3.3-4.amzn2.0.1 amzn2-core 104 k 2022-11-23T01:45:42.3772476Z rcs x86_64 5.9.0-5.amzn2.0.2 amzn2-core 231 k 2022-11-23T01:45:42.3772947Z rpm-build x86_64 4.11.3-48.amzn2.0.2 amzn2-core 150 k 2022-11-23T01:45:42.3773316Z rpm-sign x86_64 4.11.3-48.amzn2.0.2 amzn2-core 50 k 2022-11-23T01:45:42.3773754Z subversion x86_64 1.7.14-16.amzn2.0.1 amzn2-core 1.0 M 2022-11-23T01:45:42.3774177Z swig x86_64 3.0.12-11.amzn2.0.3 amzn2-core 1.4 M 2022-11-23T01:45:42.3774601Z system-rpm-config noarch 9.1.0-76.amzn2.0.14 amzn2-core 90 k 2022-11-23T01:45:42.3775047Z systemtap x86_64 4.5-1.amzn2.0.1 amzn2-core 12 k 2022-11-23T01:45:42.3775366Z Installing for dependencies: 2022-11-23T01:45:42.3775750Z apr x86_64 1.7.0-9.amzn2 amzn2-core 122 k 2022-11-23T01:45:42.3776181Z apr-util x86_64 1.6.1-5.amzn2.0.2 amzn2-core 99 k 2022-11-23T01:45:42.3776620Z apr-util-bdb x86_64 1.6.1-5.amzn2.0.2 amzn2-core 19 k 2022-11-23T01:45:42.3777059Z avahi-libs x86_64 0.6.31-20.amzn2 amzn2-core 61 k 2022-11-23T01:45:42.3777464Z cpp x86_64 7.3.1-15.amzn2 amzn2-core 9.2 M 2022-11-23T01:45:42.3777882Z dwz x86_64 0.11-3.amzn2.0.3 amzn2-core 98 k 2022-11-23T01:45:42.3778490Z efivar-libs x86_64 31-4.amzn2.0.4 amzn2-core 68 k 2022-11-23T01:45:42.3778935Z elfutils-libelf-devel x86_64 0.176-2.amzn2 amzn2-core 40 k 2022-11-23T01:45:42.3779415Z emacs-filesystem noarch 1:27.2-4.amzn2.0.1 amzn2-core 67 k 2022-11-23T01:45:42.3779864Z gdb x86_64 8.0.1-36.amzn2.0.1 amzn2-core 3.1 M 2022-11-23T01:45:42.3780322Z 
gettext-common-devel noarch 0.19.8.1-3.amzn2 amzn2-core 410 k 2022-11-23T01:45:42.3780779Z gettext-devel x86_64 0.19.8.1-3.amzn2 amzn2-core 320 k 2022-11-23T01:45:42.3781271Z glibc-devel x86_64 2.26-62.amzn2 amzn2-core 995 k 2022-11-23T01:45:42.3781816Z glibc-headers x86_64 2.26-62.amzn2 amzn2-core 516 k 2022-11-23T01:45:42.3782263Z gnutls x86_64 3.3.29-9.amzn2.0.1 amzn2-core 661 k 2022-11-23T01:45:42.3782663Z go-srpm-macros noarch 3.0.15-23.amzn2.0.2 amzn2-core 23 k 2022-11-23T01:45:42.3783142Z kernel-devel x86_64 4.14.296-222.539.amzn2 amzn2-core 13 M 2022-11-23T01:45:42.3783604Z kernel-headers x86_64 4.14.296-222.539.amzn2 amzn2-core 1.2 M 2022-11-23T01:45:42.3784353Z libatomic x86_64 7.3.1-15.amzn2 amzn2-core 46 k 2022-11-23T01:45:42.3784801Z libcilkrts x86_64 7.3.1-15.amzn2 amzn2-core 85 k 2022-11-23T01:45:42.3785363Z libgfortran x86_64 7.3.1-15.amzn2 amzn2-core 536 k 2022-11-23T01:45:42.3785809Z libitm x86_64 7.3.1-15.amzn2 amzn2-core 85 k 2022-11-23T01:45:42.3786226Z libmodman x86_64 2.0.1-8.amzn2.0.2 amzn2-core 29 k 2022-11-23T01:45:42.3786667Z libmpc x86_64 1.0.1-3.amzn2.0.2 amzn2-core 52 k 2022-11-23T01:45:42.3787107Z libmpx x86_64 7.3.1-15.amzn2 amzn2-core 51 k 2022-11-23T01:45:42.3787522Z libproxy x86_64 0.4.11-10.amzn2.0.3 amzn2-core 61 k 2022-11-23T01:45:42.3787962Z libquadmath x86_64 7.3.1-15.amzn2 amzn2-core 189 k 2022-11-23T01:45:42.3788404Z libsanitizer x86_64 7.3.1-15.amzn2 amzn2-core 642 k 2022-11-23T01:45:42.3788834Z m4 x86_64 1.4.16-10.amzn2.0.2 amzn2-core 256 k 2022-11-23T01:45:42.3789402Z mokutil x86_64 1:0.3.0-10.amzn2.0.1 amzn2-core 39 k 2022-11-23T01:45:42.3790114Z mpfr x86_64 3.1.1-4.amzn2.0.2 amzn2-core 208 k 2022-11-23T01:45:42.3790544Z neon x86_64 0.30.0-3.amzn2.0.2 amzn2-core 166 k 2022-11-23T01:45:42.3790958Z pakchois x86_64 0.4-10.amzn2.0.2 amzn2-core 14 k 2022-11-23T01:45:42.3791421Z perl-Data-Dumper x86_64 2.145-3.amzn2.0.2 amzn2-core 48 k 2022-11-23T01:45:42.3791897Z perl-Test-Harness noarch 3.28-3.amzn2 amzn2-core 302 k 
2022-11-23T01:45:42.3792535Z perl-Thread-Queue noarch 3.02-2.amzn2 amzn2-core 17 k 2022-11-23T01:45:42.3793149Z perl-XML-Parser x86_64 2.41-10.amzn2.0.2 amzn2-core 223 k 2022-11-23T01:45:42.3793620Z perl-srpm-macros noarch 1-8.amzn2.0.1 amzn2-core 4.7 k 2022-11-23T01:45:42.3794089Z subversion-libs x86_64 1.7.14-16.amzn2.0.1 amzn2-core 912 k 2022-11-23T01:45:42.3794530Z systemtap-client x86_64 4.5-1.amzn2.0.1 amzn2-core 3.7 M 2022-11-23T01:45:42.3794990Z systemtap-devel x86_64 4.5-1.amzn2.0.1 amzn2-core 2.3 M 2022-11-23T01:45:42.3795439Z trousers x86_64 0.3.14-2.amzn2.0.2 amzn2-core 294 k 2022-11-23T01:45:42.3796032Z zlib-devel x86_64 1.2.7-19.amzn2.0.2 amzn2-core 50 k 2022-11-23T01:45:42.3796217Z 2022-11-23T01:45:42.3796341Z Transaction Summary 2022-11-23T01:45:42.3796633Z ================================================================================ 2022-11-23T01:45:42.3796955Z Install 25 Packages (+43 Dependent packages) 2022-11-23T01:45:42.3797150Z 2022-11-23T01:45:42.3797249Z Total download size: 96 M 2022-11-23T01:45:42.3797515Z Installed size: 303 M 2022-11-23T01:45:42.3797788Z Downloading packages: 2022-11-23T01:45:42.3813935Z Delta RPMs disabled because /usr/bin/applydeltarpm not installed. 
2022-11-23T01:45:44.2713994Z -------------------------------------------------------------------------------- 2022-11-23T01:45:44.2714492Z Total 51 MB/s | 96 MB 00:01 2022-11-23T01:45:44.3806973Z Running transaction check 2022-11-23T01:45:44.4592538Z Running transaction test 2022-11-23T01:45:46.8526295Z Transaction test succeeded 2022-11-23T01:45:46.8528510Z Running transaction 2022-11-23T01:45:52.1362018Z Installing : mpfr-3.1.1-4.amzn2.0.2.x86_64 1/68 2022-11-23T01:45:54.4100247Z Installing : libmpc-1.0.1-3.amzn2.0.2.x86_64 2/68 2022-11-23T01:45:56.8431231Z Installing : m4-1.4.16-10.amzn2.0.2.x86_64 3/68 2022-11-23T01:45:59.3059000Z Installing : apr-1.7.0-9.amzn2.x86_64 4/68 2022-11-23T01:46:01.7280195Z Installing : apr-util-bdb-1.6.1-5.amzn2.0.2.x86_64 5/68 2022-11-23T01:46:04.1978848Z Installing : apr-util-1.6.1-5.amzn2.0.2.x86_64 6/68 2022-11-23T01:46:05.3409409Z Installing : avahi-libs-0.6.31-20.amzn2.x86_64 7/68 2022-11-23T01:46:05.3870649Z Installing : libquadmath-7.3.1-15.amzn2.x86_64 8/68 2022-11-23T01:46:05.4185639Z Installing : patch-2.7.1-12.amzn2.0.2.x86_64 9/68 2022-11-23T01:46:05.5101172Z Installing : perl-Thread-Queue-3.02-2.amzn2.noarch 10/68 2022-11-23T01:46:06.6052833Z Installing : libgfortran-7.3.1-15.amzn2.x86_64 11/68 2022-11-23T01:46:06.6575416Z Installing : cpp-7.3.1-15.amzn2.x86_64 12/68 2022-11-23T01:46:06.7059178Z Installing : libmodman-2.0.1-8.amzn2.0.2.x86_64 13/68 2022-11-23T01:46:06.7772680Z Installing : libproxy-0.4.11-10.amzn2.0.3.x86_64 14/68 2022-11-23T01:46:06.8435922Z Installing : perl-XML-Parser-2.41-10.amzn2.0.2.x86_64 15/68 2022-11-23T01:46:06.9653700Z Installing : elfutils-0.176-2.amzn2.x86_64 16/68 2022-11-23T01:46:06.9993145Z Installing : libsanitizer-7.3.1-15.amzn2.x86_64 17/68 2022-11-23T01:46:07.0277514Z Installing : 1:emacs-filesystem-27.2-4.amzn2.0.1.noarch 18/68 2022-11-23T01:46:07.0631698Z Installing : efivar-libs-31-4.amzn2.0.4.x86_64 19/68 2022-11-23T01:46:07.0962183Z Installing : 
1:mokutil-0.3.0-10.amzn2.0.1.x86_64 20/68 2022-11-23T01:46:07.1806050Z Installing : gettext-common-devel-0.19.8.1-3.amzn2.noarch 21/68 2022-11-23T01:46:07.2392691Z Installing : gettext-devel-0.19.8.1-3.amzn2.x86_64 22/68 2022-11-23T01:46:07.3395013Z Installing : dwz-0.11-3.amzn2.0.3.x86_64 23/68 2022-11-23T01:46:07.5168139Z Installing : trousers-0.3.14-2.amzn2.0.2.x86_64 24/68 2022-11-23T01:46:07.5651360Z Installing : gnutls-3.3.29-9.amzn2.0.1.x86_64 25/68 2022-11-23T01:46:07.9809833Z Installing : libitm-7.3.1-15.amzn2.x86_64 26/68 2022-11-23T01:46:08.0210872Z Installing : gdb-8.0.1-36.amzn2.0.1.x86_64 27/68 2022-11-23T01:46:08.0553522Z Installing : libmpx-7.3.1-15.amzn2.x86_64 28/68 2022-11-23T01:46:08.0766091Z Installing : perl-srpm-macros-1-8.amzn2.0.1.noarch 29/68 2022-11-23T01:46:08.1128231Z Installing : go-srpm-macros-3.0.15-23.amzn2.0.2.noarch 30/68 2022-11-23T01:46:08.1629880Z Installing : system-rpm-config-9.1.0-76.amzn2.0.14.noarch 31/68 2022-11-23T01:46:08.2728529Z Installing : perl-Data-Dumper-2.145-3.amzn2.0.2.x86_64 32/68 2022-11-23T01:46:08.3710743Z Installing : autoconf-2.69-11.amzn2.noarch 33/68 2022-11-23T01:46:08.4886748Z Installing : perl-Test-Harness-3.28-3.amzn2.noarch 34/68 2022-11-23T01:46:08.5343110Z Installing : automake-1.13.4-3.1.amzn2.noarch 35/68 2022-11-23T01:46:08.5569019Z Installing : zlib-devel-1.2.7-19.amzn2.0.2.x86_64 36/68 2022-11-23T01:46:08.5807977Z Installing : elfutils-libelf-devel-0.176-2.amzn2.x86_64 37/68 2022-11-23T01:46:08.8867882Z Installing : libatomic-7.3.1-15.amzn2.x86_64 38/68 2022-11-23T01:46:09.0708015Z Installing : kernel-headers-4.14.296-222.539.amzn2.x86_64 39/68 2022-11-23T01:46:09.2161490Z Installing : glibc-headers-2.26-62.amzn2.x86_64 40/68 2022-11-23T01:46:09.2589882Z Installing : glibc-devel-2.26-62.amzn2.x86_64 41/68 2022-11-23T01:46:11.3536983Z Installing : libcilkrts-7.3.1-15.amzn2.x86_64 42/68 2022-11-23T01:46:15.4026815Z Installing : gcc-7.3.1-15.amzn2.x86_64 43/68 2022-11-23T01:46:29.3484603Z 
Installing : kernel-devel-4.14.296-222.539.amzn2.x86_64 44/68 2022-11-23T01:46:29.9841137Z Installing : systemtap-devel-4.5-1.amzn2.0.1.x86_64 45/68 2022-11-23T01:46:30.0537582Z Installing : systemtap-client-4.5-1.amzn2.0.1.x86_64 46/68 2022-11-23T01:46:30.1174836Z Installing : pakchois-0.4-10.amzn2.0.2.x86_64 47/68 2022-11-23T01:46:30.2617520Z Installing : neon-0.30.0-3.amzn2.0.2.x86_64 48/68 2022-11-23T01:46:30.4541356Z Installing : subversion-libs-1.7.14-16.amzn2.0.1.x86_64 49/68 2022-11-23T01:46:30.5584513Z Installing : subversion-1.7.14-16.amzn2.0.1.x86_64 50/68 2022-11-23T01:46:31.8052564Z Installing : systemtap-4.5-1.amzn2.0.1.x86_64 51/68 2022-11-23T01:46:33.4854626Z Installing : gcc-gfortran-7.3.1-15.amzn2.x86_64 52/68 2022-11-23T01:46:33.6121356Z Installing : gcc-c++-7.3.1-15.amzn2.x86_64 53/68 2022-11-23T01:46:33.6571685Z Installing : libtool-2.4.2-22.2.amzn2.0.2.x86_64 54/68 2022-11-23T01:46:33.7056994Z Installing : intltool-0.50.2-7.amzn2.noarch 55/68 2022-11-23T01:46:33.7922958Z Installing : rpm-build-4.11.3-48.amzn2.0.2.x86_64 56/68 2022-11-23T01:46:33.8703983Z Installing : cscope-15.8-10.amzn2.0.2.x86_64 57/68 2022-11-23T01:46:33.9956522Z Installing : flex-2.5.37-3.amzn2.0.3.x86_64 58/68 2022-11-23T01:46:34.0784223Z Installing : bison-3.0.4-6.amzn2.0.2.x86_64 59/68 2022-11-23T01:46:34.1416494Z Installing : rcs-5.9.0-5.amzn2.0.2.x86_64 60/68 2022-11-23T01:46:34.1997206Z Installing : ctags-5.8-13.amzn2.0.2.x86_64 61/68 2022-11-23T01:46:34.2813319Z Installing : indent-2.2.11-13.amzn2.0.2.x86_64 62/68 2022-11-23T01:46:35.0195571Z Installing : patchutils-0.3.3-4.amzn2.0.1.x86_64 63/68 2022-11-23T01:46:35.0743337Z Installing : 1:doxygen-1.8.5-4.amzn2.x86_64 64/68 2022-11-23T01:46:35.1145396Z Installing : diffstat-1.57-4.amzn2.0.2.x86_64 65/68 2022-11-23T01:46:35.4494249Z Installing : byacc-1.9.20130304-3.amzn2.0.2.x86_64 66/68 2022-11-23T01:46:35.4981374Z Installing : swig-3.0.12-11.amzn2.0.3.x86_64 67/68 2022-11-23T01:46:35.5686038Z Installing : 
rpm-sign-4.11.3-48.amzn2.0.2.x86_64 68/68 2022-11-23T01:46:35.5855744Z Verifying : elfutils-libelf-devel-0.176-2.amzn2.x86_64 1/68 2022-11-23T01:46:35.6000495Z Verifying : perl-Thread-Queue-3.02-2.amzn2.noarch 2/68 2022-11-23T01:46:35.6109330Z Verifying : gettext-devel-0.19.8.1-3.amzn2.x86_64 3/68 2022-11-23T01:46:35.6256356Z Verifying : patch-2.7.1-12.amzn2.0.2.x86_64 4/68 2022-11-23T01:46:35.6413769Z Verifying : kernel-devel-4.14.296-222.539.amzn2.x86_64 5/68 2022-11-23T01:46:35.6537376Z Verifying : flex-2.5.37-3.amzn2.0.3.x86_64 6/68 2022-11-23T01:46:35.6652286Z Verifying : pakchois-0.4-10.amzn2.0.2.x86_64 7/68 2022-11-23T01:46:35.6791304Z Verifying : rpm-sign-4.11.3-48.amzn2.0.2.x86_64 8/68 2022-11-23T01:46:35.6916023Z Verifying : glibc-devel-2.26-62.amzn2.x86_64 9/68 2022-11-23T01:46:35.7016472Z Verifying : gcc-gfortran-7.3.1-15.amzn2.x86_64 10/68 2022-11-23T01:46:35.7113386Z Verifying : swig-3.0.12-11.amzn2.0.3.x86_64 11/68 2022-11-23T01:46:35.7242632Z Verifying : byacc-1.9.20130304-3.amzn2.0.2.x86_64 12/68 2022-11-23T01:46:35.7344642Z Verifying : libmpc-1.0.1-3.amzn2.0.2.x86_64 13/68 2022-11-23T01:46:35.7451979Z Verifying : libcilkrts-7.3.1-15.amzn2.x86_64 14/68 2022-11-23T01:46:35.7555169Z Verifying : kernel-headers-4.14.296-222.539.amzn2.x86_64 15/68 2022-11-23T01:46:35.7686698Z Verifying : libproxy-0.4.11-10.amzn2.0.3.x86_64 16/68 2022-11-23T01:46:35.7810354Z Verifying : cscope-15.8-10.amzn2.0.2.x86_64 17/68 2022-11-23T01:46:35.7911836Z Verifying : diffstat-1.57-4.amzn2.0.2.x86_64 18/68 2022-11-23T01:46:35.8035363Z Verifying : 1:doxygen-1.8.5-4.amzn2.x86_64 19/68 2022-11-23T01:46:35.8141939Z Verifying : gcc-c++-7.3.1-15.amzn2.x86_64 20/68 2022-11-23T01:46:35.8268518Z Verifying : libatomic-7.3.1-15.amzn2.x86_64 21/68 2022-11-23T01:46:35.8408647Z Verifying : system-rpm-config-9.1.0-76.amzn2.0.14.noarch 22/68 2022-11-23T01:46:35.8525405Z Verifying : systemtap-devel-4.5-1.amzn2.0.1.x86_64 23/68 2022-11-23T01:46:35.8645487Z Verifying : 
zlib-devel-1.2.7-19.amzn2.0.2.x86_64 24/68 2022-11-23T01:46:35.8749853Z Verifying : glibc-headers-2.26-62.amzn2.x86_64 25/68 2022-11-23T01:46:35.8858907Z Verifying : perl-Test-Harness-3.28-3.amzn2.noarch 26/68 2022-11-23T01:46:35.8972122Z Verifying : autoconf-2.69-11.amzn2.noarch 27/68 2022-11-23T01:46:35.9095609Z Verifying : libquadmath-7.3.1-15.amzn2.x86_64 28/68 2022-11-23T01:46:35.9207821Z Verifying : intltool-0.50.2-7.amzn2.noarch 29/68 2022-11-23T01:46:35.9317285Z Verifying : apr-util-1.6.1-5.amzn2.0.2.x86_64 30/68 2022-11-23T01:46:35.9429480Z Verifying : cpp-7.3.1-15.amzn2.x86_64 31/68 2022-11-23T01:46:35.9549119Z Verifying : rpm-build-4.11.3-48.amzn2.0.2.x86_64 32/68 2022-11-23T01:46:35.9677413Z Verifying : go-srpm-macros-3.0.15-23.amzn2.0.2.noarch 33/68 2022-11-23T01:46:35.9783737Z Verifying : perl-Data-Dumper-2.145-3.amzn2.0.2.x86_64 34/68 2022-11-23T01:46:35.9908062Z Verifying : perl-srpm-macros-1-8.amzn2.0.1.noarch 35/68 2022-11-23T01:46:36.0041557Z Verifying : gnutls-3.3.29-9.amzn2.0.1.x86_64 36/68 2022-11-23T01:46:36.0176344Z Verifying : subversion-libs-1.7.14-16.amzn2.0.1.x86_64 37/68 2022-11-23T01:46:36.0337084Z Verifying : automake-1.13.4-3.1.amzn2.noarch 38/68 2022-11-23T01:46:36.0470396Z Verifying : apr-util-bdb-1.6.1-5.amzn2.0.2.x86_64 39/68 2022-11-23T01:46:36.0578990Z Verifying : libmpx-7.3.1-15.amzn2.x86_64 40/68 2022-11-23T01:46:36.0735115Z Verifying : avahi-libs-0.6.31-20.amzn2.x86_64 41/68 2022-11-23T01:46:36.0892663Z Verifying : bison-3.0.4-6.amzn2.0.2.x86_64 42/68 2022-11-23T01:46:36.1005743Z Verifying : libgfortran-7.3.1-15.amzn2.x86_64 43/68 2022-11-23T01:46:36.1159373Z Verifying : gdb-8.0.1-36.amzn2.0.1.x86_64 44/68 2022-11-23T01:46:36.1264444Z Verifying : patchutils-0.3.3-4.amzn2.0.1.x86_64 45/68 2022-11-23T01:46:36.1438521Z Verifying : libitm-7.3.1-15.amzn2.x86_64 46/68 2022-11-23T01:46:36.1586925Z Verifying : libtool-2.4.2-22.2.amzn2.0.2.x86_64 47/68 2022-11-23T01:46:36.1717438Z Verifying : gcc-7.3.1-15.amzn2.x86_64 48/68 
2022-11-23T01:46:36.1867769Z Verifying : indent-2.2.11-13.amzn2.0.2.x86_64 49/68 2022-11-23T01:46:36.2018732Z Verifying : subversion-1.7.14-16.amzn2.0.1.x86_64 50/68 2022-11-23T01:46:36.2169910Z Verifying : apr-1.7.0-9.amzn2.x86_64 51/68 2022-11-23T01:46:36.2339546Z Verifying : ctags-5.8-13.amzn2.0.2.x86_64 52/68 2022-11-23T01:46:36.2468358Z Verifying : 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 53/68 2022-11-23T01:46:36.2623865Z Verifying : mpfr-3.1.1-4.amzn2.0.2.x86_64 54/68 2022-11-23T01:46:36.2770533Z Verifying : trousers-0.3.14-2.amzn2.0.2.x86_64 55/68 2022-11-23T01:46:36.2903085Z Verifying : neon-0.30.0-3.amzn2.0.2.x86_64 56/68 2022-11-23T01:46:36.3087861Z Verifying : systemtap-4.5-1.amzn2.0.1.x86_64 57/68 2022-11-23T01:46:36.3255439Z Verifying : dwz-0.11-3.amzn2.0.3.x86_64 58/68 2022-11-23T01:46:36.3399824Z Verifying : gettext-common-devel-0.19.8.1-3.amzn2.noarch 59/68 2022-11-23T01:46:36.3532087Z Verifying : systemtap-client-4.5-1.amzn2.0.1.x86_64 60/68 2022-11-23T01:46:36.3653942Z Verifying : efivar-libs-31-4.amzn2.0.4.x86_64 61/68 2022-11-23T01:46:36.3766911Z Verifying : rcs-5.9.0-5.amzn2.0.2.x86_64 62/68 2022-11-23T01:46:36.3895774Z Verifying : 1:emacs-filesystem-27.2-4.amzn2.0.1.noarch 63/68 2022-11-23T01:46:36.4020545Z Verifying : libsanitizer-7.3.1-15.amzn2.x86_64 64/68 2022-11-23T01:46:36.4132571Z Verifying : elfutils-0.176-2.amzn2.x86_64 65/68 2022-11-23T01:46:36.4234427Z Verifying : m4-1.4.16-10.amzn2.0.2.x86_64 66/68 2022-11-23T01:46:36.4353559Z Verifying : perl-XML-Parser-2.41-10.amzn2.0.2.x86_64 67/68 2022-11-23T01:46:36.6051735Z Verifying : libmodman-2.0.1-8.amzn2.0.2.x86_64 68/68 2022-11-23T01:46:36.6052048Z 2022-11-23T01:46:36.6052158Z Installed: 2022-11-23T01:46:36.6052556Z autoconf.noarch 0:2.69-11.amzn2 2022-11-23T01:46:36.6052983Z automake.noarch 0:1.13.4-3.1.amzn2 2022-11-23T01:46:36.6053428Z bison.x86_64 0:3.0.4-6.amzn2.0.2 2022-11-23T01:46:36.6060078Z byacc.x86_64 0:1.9.20130304-3.amzn2.0.2 2022-11-23T01:46:36.6060561Z cscope.x86_64 
0:15.8-10.amzn2.0.2 2022-11-23T01:46:36.6061011Z ctags.x86_64 0:5.8-13.amzn2.0.2 2022-11-23T01:46:36.6061438Z diffstat.x86_64 0:1.57-4.amzn2.0.2 2022-11-23T01:46:36.6061842Z doxygen.x86_64 1:1.8.5-4.amzn2 2022-11-23T01:46:36.6066187Z elfutils.x86_64 0:0.176-2.amzn2 2022-11-23T01:46:36.6067238Z flex.x86_64 0:2.5.37-3.amzn2.0.3 2022-11-23T01:46:36.6067937Z gcc.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6068585Z gcc-c++.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6069038Z gcc-gfortran.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6069750Z indent.x86_64 0:2.2.11-13.amzn2.0.2 2022-11-23T01:46:36.6070181Z intltool.noarch 0:0.50.2-7.amzn2 2022-11-23T01:46:36.6070555Z libtool.x86_64 0:2.4.2-22.2.amzn2.0.2 2022-11-23T01:46:36.6070974Z patch.x86_64 0:2.7.1-12.amzn2.0.2 2022-11-23T01:46:36.6071399Z patchutils.x86_64 0:0.3.3-4.amzn2.0.1 2022-11-23T01:46:36.6071939Z rcs.x86_64 0:5.9.0-5.amzn2.0.2 2022-11-23T01:46:36.6072347Z rpm-build.x86_64 0:4.11.3-48.amzn2.0.2 2022-11-23T01:46:36.6072780Z rpm-sign.x86_64 0:4.11.3-48.amzn2.0.2 2022-11-23T01:46:36.6073323Z subversion.x86_64 0:1.7.14-16.amzn2.0.1 2022-11-23T01:46:36.6073754Z swig.x86_64 0:3.0.12-11.amzn2.0.3 2022-11-23T01:46:36.6074128Z system-rpm-config.noarch 0:9.1.0-76.amzn2.0.14 2022-11-23T01:46:36.6074583Z systemtap.x86_64 0:4.5-1.amzn2.0.1 2022-11-23T01:46:36.6074786Z 2022-11-23T01:46:36.6074913Z Dependency Installed: 2022-11-23T01:46:36.6075314Z apr.x86_64 0:1.7.0-9.amzn2 2022-11-23T01:46:36.6075733Z apr-util.x86_64 0:1.6.1-5.amzn2.0.2 2022-11-23T01:46:36.6076198Z apr-util-bdb.x86_64 0:1.6.1-5.amzn2.0.2 2022-11-23T01:46:36.6076724Z avahi-libs.x86_64 0:0.6.31-20.amzn2 2022-11-23T01:46:36.6077223Z cpp.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6077557Z dwz.x86_64 0:0.11-3.amzn2.0.3 2022-11-23T01:46:36.6077979Z efivar-libs.x86_64 0:31-4.amzn2.0.4 2022-11-23T01:46:36.6078435Z elfutils-libelf-devel.x86_64 0:0.176-2.amzn2 2022-11-23T01:46:36.6078884Z emacs-filesystem.noarch 1:27.2-4.amzn2.0.1 2022-11-23T01:46:36.6079328Z 
gdb.x86_64 0:8.0.1-36.amzn2.0.1 2022-11-23T01:46:36.6079775Z gettext-common-devel.noarch 0:0.19.8.1-3.amzn2 2022-11-23T01:46:36.6080220Z gettext-devel.x86_64 0:0.19.8.1-3.amzn2 2022-11-23T01:46:36.6080652Z glibc-devel.x86_64 0:2.26-62.amzn2 2022-11-23T01:46:36.6081083Z glibc-headers.x86_64 0:2.26-62.amzn2 2022-11-23T01:46:36.6081510Z gnutls.x86_64 0:3.3.29-9.amzn2.0.1 2022-11-23T01:46:36.6081930Z go-srpm-macros.noarch 0:3.0.15-23.amzn2.0.2 2022-11-23T01:46:36.6082382Z kernel-devel.x86_64 0:4.14.296-222.539.amzn2 2022-11-23T01:46:36.6082823Z kernel-headers.x86_64 0:4.14.296-222.539.amzn2 2022-11-23T01:46:36.6083232Z libatomic.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6083657Z libcilkrts.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6084088Z libgfortran.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6084507Z libitm.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6084903Z libmodman.x86_64 0:2.0.1-8.amzn2.0.2 2022-11-23T01:46:36.6085321Z libmpc.x86_64 0:1.0.1-3.amzn2.0.2 2022-11-23T01:46:36.6085899Z libmpx.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6086411Z libproxy.x86_64 0:0.4.11-10.amzn2.0.3 2022-11-23T01:46:36.6086756Z libquadmath.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6087183Z libsanitizer.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:46:36.6087671Z m4.x86_64 0:1.4.16-10.amzn2.0.2 2022-11-23T01:46:36.6088146Z mokutil.x86_64 1:0.3.0-10.amzn2.0.1 2022-11-23T01:46:36.6088569Z mpfr.x86_64 0:3.1.1-4.amzn2.0.2 2022-11-23T01:46:36.6088987Z neon.x86_64 0:0.30.0-3.amzn2.0.2 2022-11-23T01:46:36.6089386Z pakchois.x86_64 0:0.4-10.amzn2.0.2 2022-11-23T01:46:36.6089836Z perl-Data-Dumper.x86_64 0:2.145-3.amzn2.0.2 2022-11-23T01:46:36.6090309Z perl-Test-Harness.noarch 0:3.28-3.amzn2 2022-11-23T01:46:36.6090785Z perl-Thread-Queue.noarch 0:3.02-2.amzn2 2022-11-23T01:46:36.6091239Z perl-XML-Parser.x86_64 0:2.41-10.amzn2.0.2 2022-11-23T01:46:36.6091708Z perl-srpm-macros.noarch 0:1-8.amzn2.0.1 2022-11-23T01:46:36.6092172Z subversion-libs.x86_64 0:1.7.14-16.amzn2.0.1 
2022-11-23T01:46:36.6092608Z systemtap-client.x86_64 0:4.5-1.amzn2.0.1 2022-11-23T01:46:36.6093054Z systemtap-devel.x86_64 0:4.5-1.amzn2.0.1 2022-11-23T01:46:36.6093485Z trousers.x86_64 0:0.3.14-2.amzn2.0.2 2022-11-23T01:46:36.6093914Z zlib-devel.x86_64 0:1.2.7-19.amzn2.0.2 2022-11-23T01:46:36.6094103Z 2022-11-23T01:46:36.6094211Z Complete! 2022-11-23T01:46:36.6490453Z ++ uname -r 2022-11-23T01:46:36.6502070Z + sudo yum install -y 'kernel-devel-uname-r == 4.14.252-195.483.amzn2.x86_64' 2022-11-23T01:46:37.1786814Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:46:37.4758317Z Resolving Dependencies 2022-11-23T01:46:37.4764893Z --> Running transaction check 2022-11-23T01:46:37.4765389Z ---> Package kernel-devel.x86_64 0:4.14.252-195.483.amzn2 will be installed 2022-11-23T01:46:37.7761242Z --> Finished Dependency Resolution 2022-11-23T01:46:37.8639665Z 2022-11-23T01:46:37.8640141Z Dependencies Resolved 2022-11-23T01:46:37.8646124Z 2022-11-23T01:46:37.8646424Z ================================================================================ 2022-11-23T01:46:37.8646935Z Package Arch Version Repository Size 2022-11-23T01:46:37.8647619Z ================================================================================ 2022-11-23T01:46:37.8648150Z Installing: 2022-11-23T01:46:37.8649042Z kernel-devel x86_64 4.14.252-195.483.amzn2 amzn2-core 13 M 2022-11-23T01:46:37.8649349Z 2022-11-23T01:46:37.8649471Z Transaction Summary 2022-11-23T01:46:37.8649775Z ================================================================================ 2022-11-23T01:46:37.8650040Z Install 1 Package 2022-11-23T01:46:37.8650197Z 2022-11-23T01:46:37.8650327Z Total download size: 13 M 2022-11-23T01:46:37.8650590Z Installed size: 60 M 2022-11-23T01:46:37.8650865Z Downloading packages: 2022-11-23T01:46:37.8660380Z Delta RPMs disabled because /usr/bin/applydeltarpm not installed. 
2022-11-23T01:46:38.1699812Z Running transaction check 2022-11-23T01:46:38.1889047Z Running transaction test 2022-11-23T01:46:38.6040344Z Transaction test succeeded 2022-11-23T01:46:38.6042219Z Running transaction 2022-11-23T01:46:57.2047658Z Installing : kernel-devel-4.14.252-195.483.amzn2.x86_64 1/1 2022-11-23T01:46:57.2934044Z Verifying : kernel-devel-4.14.252-195.483.amzn2.x86_64 1/1 2022-11-23T01:46:57.2934549Z 2022-11-23T01:46:57.2934704Z Installed: 2022-11-23T01:46:57.2935436Z kernel-devel.x86_64 0:4.14.252-195.483.amzn2 2022-11-23T01:46:57.2935763Z 2022-11-23T01:46:57.2935923Z Complete! 2022-11-23T01:46:57.3288858Z + sudo modprobe backlight 2022-11-23T01:46:57.3556737Z + sudo curl -fsL -o /tmp/nvidia_driver https://s3.amazonaws.com/ossci-linux/nvidia_driver/NVIDIA-Linux-x86_64-515.76.run 2022-11-23T01:47:01.0362178Z + set +e 2022-11-23T01:47:01.0362842Z + sudo /bin/bash /tmp/nvidia_driver -s --no-drm 2022-11-23T01:47:02.3956609Z Verifying archive integrity... OK 2022-11-23T01:47:29.0334186Z Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 
515.76................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ 2022-11-23T01:47:29.1893860Z 2022-11-23T01:47:29.1894762Z WARNING: The nvidia-drm module will not be installed. As a result, DRM-KMS will not function with this installation of the NVIDIA driver. 2022-11-23T01:47:29.1896387Z 2022-11-23T01:47:40.8482136Z 2022-11-23T01:47:40.8483734Z WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path '/usr/lib64/xorg/modules'; these paths were not queryable from the system. If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility and the X.Org SDK/development package for your distribution and reinstall the driver. 
2022-11-23T01:47:40.8484416Z 2022-11-23T01:47:50.9571785Z + NVIDIA_INSTALLATION_STATUS=0 2022-11-23T01:47:50.9572146Z + RESET_GPU=0 2022-11-23T01:47:50.9572713Z + '[' 0 -ne 0 ']' 2022-11-23T01:47:50.9573287Z ++ command -v nvidia-smi 2022-11-23T01:47:50.9576068Z + '[' -x /usr/bin/nvidia-smi ']' 2022-11-23T01:47:50.9581695Z ++ nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0 2022-11-23T01:47:59.9903571Z + INSTALLED_DRIVER_VERSION=515.76 2022-11-23T01:47:59.9904327Z + NVIDIA_SMI_STATUS=0 2022-11-23T01:47:59.9904743Z + '[' 0 -ne 0 ']' 2022-11-23T01:47:59.9905022Z + '[' 0 -eq 1 ']' 2022-11-23T01:47:59.9905366Z + sudo rm -fv /tmp/nvidia_driver 2022-11-23T01:48:00.0509053Z removed ‘/tmp/nvidia_driver’ 2022-11-23T01:48:00.0526590Z + set -e 2022-11-23T01:48:00.0527142Z + sudo modprobe nvidia 2022-11-23T01:48:00.0661565Z + echo 'After installing NVIDIA driver' 2022-11-23T01:48:00.0661871Z + lspci 2022-11-23T01:48:00.0662133Z After installing NVIDIA driver 2022-11-23T01:48:00.0916561Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02) 2022-11-23T01:48:00.0917009Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2022-11-23T01:48:00.0917520Z 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 2022-11-23T01:48:00.0917901Z 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01) 2022-11-23T01:48:00.0918275Z 00:02.0 VGA compatible controller: Cirrus Logic GD 5446 2022-11-23T01:48:00.0918671Z 00:03.0 Ethernet controller: Amazon.com, Inc. 
Elastic Network Adapter (ENA) 2022-11-23T01:48:00.0919353Z 00:1b.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:48:00.0922716Z 00:1c.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:48:00.0923153Z 00:1d.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:48:00.0923583Z 00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:48:00.0924014Z 00:1f.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01) 2022-11-23T01:48:00.0924310Z + lsmod 2022-11-23T01:48:00.0941540Z Module Size Used by 2022-11-23T01:48:00.0941831Z nvidia 40808448 0 2022-11-23T01:48:00.0942119Z drm 425984 1 nvidia 2022-11-23T01:48:00.0942394Z i2c_core 77824 2 nvidia,drm 2022-11-23T01:48:00.0942666Z backlight 16384 0 2022-11-23T01:48:00.0943000Z xt_conntrack 16384 1 2022-11-23T01:48:00.0943274Z ipt_MASQUERADE 16384 1 2022-11-23T01:48:00.0943569Z nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE 2022-11-23T01:48:00.0944209Z nf_conntrack_netlink 49152 0 2022-11-23T01:48:00.0944542Z nfnetlink 16384 2 nf_conntrack_netlink 2022-11-23T01:48:00.0946304Z xfrm_user 45056 1 2022-11-23T01:48:00.0946593Z xfrm_algo 16384 1 xfrm_user 2022-11-23T01:48:00.0946881Z xt_addrtype 16384 2 2022-11-23T01:48:00.0947169Z iptable_filter 16384 1 2022-11-23T01:48:00.0947419Z iptable_nat 16384 1 2022-11-23T01:48:00.0947688Z nf_conntrack_ipv4 16384 3 2022-11-23T01:48:00.0948002Z nf_defrag_ipv4 16384 1 nf_conntrack_ipv4 2022-11-23T01:48:00.0948407Z nf_nat_ipv4 16384 1 iptable_nat 2022-11-23T01:48:00.0948733Z nf_nat 36864 2 nf_nat_masquerade_ipv4,nf_nat_ipv4 2022-11-23T01:48:00.0949208Z nf_conntrack 155648 7 xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink 2022-11-23T01:48:00.0949614Z br_netfilter 24576 0 2022-11-23T01:48:00.0949883Z bridge 172032 1 br_netfilter 2022-11-23T01:48:00.0960468Z stp 16384 
1 bridge 2022-11-23T01:48:00.0960765Z llc 16384 2 bridge,stp 2022-11-23T01:48:00.0961235Z overlay 86016 0 2022-11-23T01:48:00.0961596Z sunrpc 393216 1 2022-11-23T01:48:00.0961831Z dm_mirror 28672 0 2022-11-23T01:48:00.0962109Z dm_region_hash 20480 1 dm_mirror 2022-11-23T01:48:00.0962424Z dm_log 20480 2 dm_region_hash,dm_mirror 2022-11-23T01:48:00.0962726Z dm_mod 143360 2 dm_log,dm_mirror 2022-11-23T01:48:00.0963003Z dax 69632 1 dm_mod 2022-11-23T01:48:00.0963249Z sb_edac 24576 0 2022-11-23T01:48:00.0963512Z crc32_pclmul 16384 0 2022-11-23T01:48:00.0963758Z ghash_clmulni_intel 16384 0 2022-11-23T01:48:00.0964023Z pcbc 16384 0 2022-11-23T01:48:00.0964279Z ata_piix 36864 0 2022-11-23T01:48:00.0964507Z aesni_intel 188416 0 2022-11-23T01:48:00.0964776Z aes_x86_64 20480 1 aesni_intel 2022-11-23T01:48:00.0965048Z libata 266240 1 ata_piix 2022-11-23T01:48:00.0965315Z crypto_simd 16384 1 aesni_intel 2022-11-23T01:48:00.0965606Z glue_helper 16384 1 aesni_intel 2022-11-23T01:48:00.0965947Z cryptd 28672 3 crypto_simd,ghash_clmulni_intel,aesni_intel 2022-11-23T01:48:00.0966248Z mousedev 24576 0 2022-11-23T01:48:00.0966515Z pcc_cpufreq 16384 0 2022-11-23T01:48:00.0966812Z scsi_mod 245760 1 libata 2022-11-23T01:48:00.0967077Z evdev 20480 3 2022-11-23T01:48:00.0967310Z psmouse 32768 0 2022-11-23T01:48:00.0967559Z button 16384 0 2022-11-23T01:48:00.0967806Z ena 114688 0 2022-11-23T01:48:00.0968041Z xen_blkfront 49152 2 2022-11-23T01:48:00.0968295Z crc32c_intel 24576 0 2022-11-23T01:48:00.0968729Z autofs4 49152 2 2022-11-23T01:48:00.0969107Z + modinfo nvidia 2022-11-23T01:48:00.0969590Z filename: /lib/modules/4.14.252-195.483.amzn2.x86_64/kernel/drivers/video/nvidia.ko 2022-11-23T01:48:00.0969939Z firmware: nvidia/515.76/gsp.bin 2022-11-23T01:48:00.0970254Z alias: char-major-195-* 2022-11-23T01:48:00.0970526Z version: 515.76 2022-11-23T01:48:00.0970780Z supported: external 2022-11-23T01:48:00.0971034Z license: NVIDIA 2022-11-23T01:48:00.0971292Z srcversion: 51FD9DD90150B35351AFFBB 
2022-11-23T01:48:00.0971695Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2022-11-23T01:48:00.0972094Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2022-11-23T01:48:00.0972383Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2022-11-23T01:48:00.0972720Z depends: i2c-core,drm 2022-11-23T01:48:00.0972984Z retpoline: Y 2022-11-23T01:48:00.0973206Z name: nvidia 2022-11-23T01:48:00.0973606Z vermagic: 4.14.252-195.483.amzn2.x86_64 SMP mod_unload modversions 2022-11-23T01:48:00.0973983Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2022-11-23T01:48:00.0974360Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2022-11-23T01:48:00.0974723Z parm: NVreg_ResmanDebugLevel:int 2022-11-23T01:48:00.0975027Z parm: NVreg_RmLogonRC:int 2022-11-23T01:48:00.0975317Z parm: NVreg_ModifyDeviceFiles:int 2022-11-23T01:48:00.0975623Z parm: NVreg_DeviceFileUID:int 2022-11-23T01:48:00.0975926Z parm: NVreg_DeviceFileGID:int 2022-11-23T01:48:00.0976209Z parm: NVreg_DeviceFileMode:int 2022-11-23T01:48:00.0976567Z parm: NVreg_InitializeSystemMemoryAllocations:int 2022-11-23T01:48:00.0976939Z parm: NVreg_UsePageAttributeTable:int 2022-11-23T01:48:00.0977259Z parm: NVreg_EnablePCIeGen3:int 2022-11-23T01:48:00.0977537Z parm: NVreg_EnableMSI:int 2022-11-23T01:48:00.0977828Z parm: NVreg_TCEBypassMode:int 2022-11-23T01:48:00.0978147Z parm: NVreg_EnableStreamMemOPs:int 2022-11-23T01:48:00.0978487Z parm: NVreg_RestrictProfilingToAdminUsers:int 2022-11-23T01:48:00.0978878Z parm: NVreg_PreserveVideoMemoryAllocations:int 2022-11-23T01:48:00.0979251Z parm: NVreg_EnableS0ixPowerManagement:int 2022-11-23T01:48:00.0979640Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2022-11-23T01:48:00.0980039Z parm: NVreg_DynamicPowerManagement:int 2022-11-23T01:48:00.0980452Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2022-11-23T01:48:00.0980837Z parm: NVreg_EnableGpuFirmware:int 2022-11-23T01:48:00.0981159Z parm: NVreg_EnableGpuFirmwareLogs:int 2022-11-23T01:48:00.0981518Z parm: 
NVreg_OpenRmEnableUnsupportedGpus:int 2022-11-23T01:48:00.0981884Z parm: NVreg_EnableUserNUMAManagement:int 2022-11-23T01:48:00.0982194Z parm: NVreg_MemoryPoolSize:int 2022-11-23T01:48:00.0982515Z parm: NVreg_KMallocHeapMaxSize:int 2022-11-23T01:48:00.0982839Z parm: NVreg_VMallocHeapMaxSize:int 2022-11-23T01:48:00.0983135Z parm: NVreg_IgnoreMMIOCheck:int 2022-11-23T01:48:00.0983439Z parm: NVreg_NvLinkDisable:int 2022-11-23T01:48:00.0983785Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2022-11-23T01:48:00.0984511Z parm: NVreg_RegisterPCIDriver:int 2022-11-23T01:48:00.0984836Z parm: NVreg_EnableDbgBreakpoint:int 2022-11-23T01:48:00.0985159Z parm: NVreg_RegistryDwords:charp 2022-11-23T01:48:00.0985485Z parm: NVreg_RegistryDwordsPerDevice:charp 2022-11-23T01:48:00.0985801Z parm: NVreg_RmMsg:charp 2022-11-23T01:48:00.0986092Z parm: NVreg_GpuBlacklist:charp 2022-11-23T01:48:00.0986395Z parm: NVreg_TemporaryFilePath:charp 2022-11-23T01:48:00.0986712Z parm: NVreg_ExcludedGpus:charp 2022-11-23T01:48:00.0987013Z parm: NVreg_DmaRemapPeerMmio:int 2022-11-23T01:48:00.0987334Z parm: rm_firmware_active:charp 2022-11-23T01:48:00.0987683Z + set +e 2022-11-23T01:48:00.0987944Z + nvidia-smi 2022-11-23T01:48:06.9272181Z Wed Nov 23 01:48:06 2022 2022-11-23T01:48:06.9272981Z +-----------------------------------------------------------------------------+ 2022-11-23T01:48:06.9273516Z | NVIDIA-SMI 515.76 Driver Version: 515.76 CUDA Version: 11.7 | 2022-11-23T01:48:06.9274010Z |-------------------------------+----------------------+----------------------+ 2022-11-23T01:48:06.9274508Z | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | 2022-11-23T01:48:06.9275337Z | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | 2022-11-23T01:48:06.9275700Z | | | MIG M. 
| 2022-11-23T01:48:06.9275982Z |===============================+======================+======================| 2022-11-23T01:48:06.9324210Z | 0 Tesla M60 Off | 00000000:00:1B.0 Off | 0 | 2022-11-23T01:48:06.9324617Z | N/A 24C P0 38W / 150W | 0MiB / 7680MiB | 0% Default | 2022-11-23T01:48:06.9324958Z | | | N/A | 2022-11-23T01:48:06.9325437Z +-------------------------------+----------------------+----------------------+ 2022-11-23T01:48:06.9375383Z | 1 Tesla M60 Off | 00000000:00:1C.0 Off | 0 | 2022-11-23T01:48:06.9375760Z | N/A 28C P0 39W / 150W | 0MiB / 7680MiB | 0% Default | 2022-11-23T01:48:06.9376108Z | | | N/A | 2022-11-23T01:48:06.9376583Z +-------------------------------+----------------------+----------------------+ 2022-11-23T01:48:06.9427350Z | 2 Tesla M60 Off | 00000000:00:1D.0 Off | 1618012799 | 2022-11-23T01:48:06.9427740Z | N/A 22C P0 38W / 150W | 0MiB / 7680MiB | 0% Default | 2022-11-23T01:48:06.9428081Z | | | N/A | 2022-11-23T01:48:06.9428555Z +-------------------------------+----------------------+----------------------+ 2022-11-23T01:48:06.9480120Z | 3 Tesla M60 Off | 00000000:00:1E.0 Off | 1453669023 | 2022-11-23T01:48:06.9480502Z | N/A 29C P0 40W / 150W | 0MiB / 7680MiB | 69% Default | 2022-11-23T01:48:06.9480851Z | | | N/A | 2022-11-23T01:48:06.9481334Z +-------------------------------+----------------------+----------------------+ 2022-11-23T01:48:06.9481705Z 2022-11-23T01:48:06.9482143Z +-----------------------------------------------------------------------------+ 2022-11-23T01:48:06.9482507Z | Processes: | 2022-11-23T01:48:06.9482869Z | GPU GI CI PID Type Process name GPU Memory | 2022-11-23T01:48:06.9483216Z | ID ID Usage | 2022-11-23T01:48:06.9483516Z |=============================================================================| 2022-11-23T01:48:06.9494400Z | No running processes found | 2022-11-23T01:48:06.9494953Z +-----------------------------------------------------------------------------+ 2022-11-23T01:48:08.0814264Z + 
NVIDIA_SMI_STATUS=0 2022-11-23T01:48:08.0815023Z + '[' 0 -eq 0 ']' 2022-11-23T01:48:08.0815358Z + echo 'INFO: Ignoring allowed status 0' 2022-11-23T01:48:08.0815653Z + set -e 2022-11-23T01:48:08.0815905Z INFO: Ignoring allowed status 0 2022-11-23T01:48:08.0819882Z == Installing nvidia container toolkit for amzn2 == 2022-11-23T01:48:08.0825846Z + sudo yum install -y yum-utils 2022-11-23T01:48:08.6528060Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:48:08.9408206Z Package yum-utils-1.1.31-46.amzn2.0.1.noarch already installed and latest version 2022-11-23T01:48:08.9409052Z Nothing to do 2022-11-23T01:48:08.9631786Z + sudo yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo 2022-11-23T01:48:09.5202815Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:48:09.5514797Z adding repo from: https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo 2022-11-23T01:48:09.5515520Z grabbing file https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo to /etc/yum.repos.d/nvidia-docker.repo 2022-11-23T01:48:09.5516362Z repo saved to /etc/yum.repos.d/nvidia-docker.repo 2022-11-23T01:48:09.5675213Z + sudo yum install -y nvidia-docker2 2022-11-23T01:48:10.1177389Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:48:10.1643824Z Retrieving key from https://nvidia.github.io/libnvidia-container/gpgkey 2022-11-23T01:48:10.1727415Z Importing GPG key 0xF796ECB0: 2022-11-23T01:48:10.1727861Z Userid : "NVIDIA CORPORATION (Open Source Projects) " 2022-11-23T01:48:10.1728336Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-11-23T01:48:10.1728782Z From : https://nvidia.github.io/libnvidia-container/gpgkey 2022-11-23T01:48:10.5913557Z Retrieving key from https://nvidia.github.io/nvidia-container-runtime/gpgkey 2022-11-23T01:48:10.5986545Z Importing GPG key 0xF796ECB0: 2022-11-23T01:48:10.5987005Z Userid : "NVIDIA 
CORPORATION (Open Source Projects) " 2022-11-23T01:48:10.5987440Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-11-23T01:48:10.5987903Z From : https://nvidia.github.io/nvidia-container-runtime/gpgkey 2022-11-23T01:48:10.8370958Z Retrieving key from https://nvidia.github.io/nvidia-docker/gpgkey 2022-11-23T01:48:10.8448038Z Importing GPG key 0xF796ECB0: 2022-11-23T01:48:10.8448857Z Userid : "NVIDIA CORPORATION (Open Source Projects) " 2022-11-23T01:48:10.8449390Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-11-23T01:48:10.8449810Z From : https://nvidia.github.io/nvidia-docker/gpgkey 2022-11-23T01:48:12.6575955Z Resolving Dependencies 2022-11-23T01:48:12.6583561Z --> Running transaction check 2022-11-23T01:48:12.6584479Z ---> Package nvidia-docker2.noarch 0:2.11.0-1 will be installed 2022-11-23T01:48:12.6611322Z --> Processing Dependency: nvidia-container-toolkit >= 1.10.0-1 for package: nvidia-docker2-2.11.0-1.noarch 2022-11-23T01:48:12.7002360Z --> Running transaction check 2022-11-23T01:48:12.7002889Z ---> Package nvidia-container-toolkit.x86_64 0:1.11.0-1 will be installed 2022-11-23T01:48:12.7166873Z --> Processing Dependency: nvidia-container-toolkit-base = 1.11.0-1 for package: nvidia-container-toolkit-1.11.0-1.x86_64 2022-11-23T01:48:12.7177631Z --> Processing Dependency: libnvidia-container-tools < 2.0.0 for package: nvidia-container-toolkit-1.11.0-1.x86_64 2022-11-23T01:48:12.7308218Z --> Processing Dependency: libnvidia-container-tools >= 1.11.0-1 for package: nvidia-container-toolkit-1.11.0-1.x86_64 2022-11-23T01:48:12.7308796Z --> Running transaction check 2022-11-23T01:48:12.7309248Z ---> Package libnvidia-container-tools.x86_64 0:1.11.0-1 will be installed 2022-11-23T01:48:12.7320130Z --> Processing Dependency: libnvidia-container1(x86-64) >= 1.11.0-1 for package: libnvidia-container-tools-1.11.0-1.x86_64 2022-11-23T01:48:12.7347263Z --> Processing Dependency: libnvidia-container.so.1(NVC_1.0)(64bit) for 
package: libnvidia-container-tools-1.11.0-1.x86_64 2022-11-23T01:48:12.7348053Z --> Processing Dependency: libnvidia-container.so.1()(64bit) for package: libnvidia-container-tools-1.11.0-1.x86_64 2022-11-23T01:48:12.7348769Z ---> Package nvidia-container-toolkit-base.x86_64 0:1.11.0-1 will be installed 2022-11-23T01:48:12.7349996Z --> Running transaction check 2022-11-23T01:48:12.7350485Z ---> Package libnvidia-container1.x86_64 0:1.11.0-1 will be installed 2022-11-23T01:48:13.0391652Z --> Finished Dependency Resolution 2022-11-23T01:48:13.1177821Z 2022-11-23T01:48:13.1178148Z Dependencies Resolved 2022-11-23T01:48:13.1191904Z 2022-11-23T01:48:13.1192185Z ================================================================================ 2022-11-23T01:48:13.1192917Z Package Arch Version Repository Size 2022-11-23T01:48:13.1193398Z ================================================================================ 2022-11-23T01:48:13.1193666Z Installing: 2022-11-23T01:48:13.1194137Z nvidia-docker2 noarch 2.11.0-1 libnvidia-container 8.7 k 2022-11-23T01:48:13.1194915Z Installing for dependencies: 2022-11-23T01:48:13.1195529Z libnvidia-container-tools x86_64 1.11.0-1 libnvidia-container 49 k 2022-11-23T01:48:13.1196205Z libnvidia-container1 x86_64 1.11.0-1 libnvidia-container 1.0 M 2022-11-23T01:48:13.1196628Z nvidia-container-toolkit x86_64 1.11.0-1 libnvidia-container 780 k 2022-11-23T01:48:13.1197206Z nvidia-container-toolkit-base x86_64 1.11.0-1 libnvidia-container 2.5 M 2022-11-23T01:48:13.1197463Z 2022-11-23T01:48:13.1197580Z Transaction Summary 2022-11-23T01:48:13.1197874Z ================================================================================ 2022-11-23T01:48:13.1198174Z Install 1 Package (+4 Dependent packages) 2022-11-23T01:48:13.1198373Z 2022-11-23T01:48:13.1198502Z Total download size: 4.3 M 2022-11-23T01:48:13.1198779Z Installed size: 12 M 2022-11-23T01:48:13.1199023Z Downloading packages: 2022-11-23T01:48:13.2385980Z 
-------------------------------------------------------------------------------- 2022-11-23T01:48:13.2387637Z Total 36 MB/s | 4.3 MB 00:00 2022-11-23T01:48:13.2441921Z Running transaction check 2022-11-23T01:48:13.2630587Z Running transaction test 2022-11-23T01:48:13.2808882Z Transaction test succeeded 2022-11-23T01:48:13.2811045Z Running transaction 2022-11-23T01:48:18.4068537Z Installing : nvidia-container-toolkit-base-1.11.0-1.x86_64 1/5 2022-11-23T01:48:20.2145609Z Installing : libnvidia-container1-1.11.0-1.x86_64 2/5 2022-11-23T01:48:20.3331415Z Installing : libnvidia-container-tools-1.11.0-1.x86_64 3/5 2022-11-23T01:48:20.3585893Z Installing : nvidia-container-toolkit-1.11.0-1.x86_64 4/5 2022-11-23T01:48:20.4040979Z Installing : nvidia-docker2-2.11.0-1.noarch 5/5 2022-11-23T01:48:20.4151898Z Verifying : libnvidia-container1-1.11.0-1.x86_64 1/5 2022-11-23T01:48:20.4263263Z Verifying : nvidia-container-toolkit-base-1.11.0-1.x86_64 2/5 2022-11-23T01:48:20.4371800Z Verifying : nvidia-container-toolkit-1.11.0-1.x86_64 3/5 2022-11-23T01:48:20.4475070Z Verifying : libnvidia-container-tools-1.11.0-1.x86_64 4/5 2022-11-23T01:48:20.5289512Z Verifying : nvidia-docker2-2.11.0-1.noarch 5/5 2022-11-23T01:48:20.5289822Z 2022-11-23T01:48:20.5289940Z Installed: 2022-11-23T01:48:20.5290366Z nvidia-docker2.noarch 0:2.11.0-1 2022-11-23T01:48:20.5290668Z 2022-11-23T01:48:20.5290781Z Dependency Installed: 2022-11-23T01:48:20.5291215Z libnvidia-container-tools.x86_64 0:1.11.0-1 2022-11-23T01:48:20.5291622Z libnvidia-container1.x86_64 0:1.11.0-1 2022-11-23T01:48:20.5292095Z nvidia-container-toolkit.x86_64 0:1.11.0-1 2022-11-23T01:48:20.5292603Z nvidia-container-toolkit-base.x86_64 0:1.11.0-1 2022-11-23T01:48:20.5292851Z 2022-11-23T01:48:20.5292962Z Complete! 2022-11-23T01:48:20.6376252Z + sudo systemctl restart docker 2022-11-23T01:48:21.6020416Z Command completed after 1 attempt(s). 
2022-11-23T01:48:21.6020664Z 2022-11-23T01:48:21.6023176Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:48:21.6076381Z ##[group]Run python3 -m pip install psutil==5.9.1 2022-11-23T01:48:21.6076777Z python3 -m pip install psutil==5.9.1 2022-11-23T01:48:21.6077086Z python3 -m pip install pynvml==11.4.1 2022-11-23T01:48:21.6077440Z python3 -m tools.stats.monitor > usage_log.txt 2>&1 & 2022-11-23T01:48:21.6077955Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2022-11-23T01:48:21.6092868Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:48:21.6093164Z env: 2022-11-23T01:48:21.6093485Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:48:21.6093735Z GPU_FLAG: --gpus all 2022-11-23T01:48:21.6093986Z ##[endgroup] 2022-11-23T01:48:22.8712201Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T01:48:23.2842437Z Collecting psutil==5.9.1 2022-11-23T01:48:23.3079114Z Downloading psutil-5.9.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (281 kB) 2022-11-23T01:48:23.3903432Z Installing collected packages: psutil 2022-11-23T01:48:23.5556884Z Successfully installed psutil-5.9.1 2022-11-23T01:48:24.0729813Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T01:48:24.1974422Z Collecting pynvml==11.4.1 2022-11-23T01:48:24.2114123Z Downloading pynvml-11.4.1-py3-none-any.whl (46 kB) 2022-11-23T01:48:24.2631660Z Installing collected packages: pynvml 2022-11-23T01:48:24.3190454Z Successfully installed pynvml-11.4.1 2022-11-23T01:48:24.3734626Z Prepare all required actions 2022-11-23T01:48:24.3735019Z Getting action download info 2022-11-23T01:48:24.5687344Z Download action repository 'seemethere/download-artifact-s3@v4' 
(SHA:4a8bfae15cc25cc0785c1603ee87a9da8fd442ea) 2022-11-23T01:48:24.8209864Z Download action repository 'actions/download-artifact@v3' (SHA:9782bd6a9848b53b110e712e20e42d89988822b7) 2022-11-23T01:48:24.9526214Z ##[group]Run ./.github/actions/download-build-artifacts 2022-11-23T01:48:24.9526517Z with: 2022-11-23T01:48:24.9526794Z name: linux-bionic-cuda11.6-py3.9-gcc7 2022-11-23T01:48:24.9527055Z env: 2022-11-23T01:48:24.9527291Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:48:24.9527557Z GPU_FLAG: --gpus all 2022-11-23T01:48:24.9527785Z ##[endgroup] 2022-11-23T01:48:24.9555721Z ##[group]Run seemethere/download-artifact-s3@v4 2022-11-23T01:48:24.9556021Z with: 2022-11-23T01:48:24.9556281Z name: linux-bionic-cuda11.6-py3.9-gcc7 2022-11-23T01:48:24.9556585Z s3-bucket: gha-artifacts 2022-11-23T01:48:24.9556863Z region: us-east-1 2022-11-23T01:48:24.9557097Z env: 2022-11-23T01:48:24.9557330Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:48:24.9557575Z GPU_FLAG: --gpus all 2022-11-23T01:48:24.9557821Z ##[endgroup] 2022-11-23T01:48:25.4926744Z Found 1 objects with prefix pytorch/pytorch/3528394938/linux-bionic-cuda11.6-py3.9-gcc7/ 2022-11-23T01:48:25.4927580Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2022-11-23T01:48:33.5523587Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2022-11-23T01:48:33.5523920Z 2022-11-23T01:48:33.5527755Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. 
For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:48:33.5529167Z Artifact download has finished successfully 2022-11-23T01:48:33.5858804Z ##[group]Run unzip -o artifacts.zip 2022-11-23T01:48:33.5859151Z unzip -o artifacts.zip 2022-11-23T01:48:33.5875385Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:48:33.5875604Z env: 2022-11-23T01:48:33.5875862Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:48:33.5876142Z GPU_FLAG: --gpus all 2022-11-23T01:48:33.5876417Z ##[endgroup] 2022-11-23T01:48:33.6012690Z Archive: artifacts.zip 2022-11-23T01:48:33.6013653Z creating: dist/ 2022-11-23T01:48:35.7497060Z inflating: dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:48:35.7497488Z creating: build/custom_test_artifacts/ 2022-11-23T01:48:35.7497912Z creating: build/custom_test_artifacts/custom-op-build/ 2022-11-23T01:48:35.7498391Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2022-11-23T01:48:35.7505335Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:48:35.7506193Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/ 2022-11-23T01:48:35.7506763Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:48:35.7507333Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:48:35.7507872Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:48:35.7508751Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:48:35.7510519Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:48:35.7511092Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:48:35.7511636Z 
creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:48:35.7513612Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:48:35.7514984Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:48:35.7516311Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:48:35.7516985Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:48:35.7518636Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:48:35.7519467Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:48:35.7520061Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:48:35.7520630Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:48:35.7576586Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:48:35.7577349Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:48:35.7578079Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:48:35.7578823Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:48:35.7579542Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:48:35.7580239Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-11-23T01:48:35.7580944Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:48:35.7581636Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:48:35.7582322Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:48:35.7623836Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:48:35.7665216Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:48:35.7666280Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:48:35.7667351Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:48:35.7668017Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:48:35.7668886Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:48:35.7669530Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:48:35.7670281Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:48:35.7671668Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:48:35.7745754Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:48:35.7821832Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:48:35.7822517Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:48:35.7823077Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:48:35.7823791Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeError.log 2022-11-23T01:48:35.7824692Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2022-11-23T01:48:35.7825235Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2022-11-23T01:48:35.7825818Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2022-11-23T01:48:35.7826434Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2022-11-23T01:48:35.7827035Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2022-11-23T01:48:35.7827674Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2022-11-23T01:48:35.7828249Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2022-11-23T01:48:35.7828842Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2022-11-23T01:48:35.7829445Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2022-11-23T01:48:35.7830032Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2022-11-23T01:48:35.7830606Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2022-11-23T01:48:35.7851291Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2022-11-23T01:48:35.7968393Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2022-11-23T01:48:35.7969006Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2022-11-23T01:48:35.7969599Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2022-11-23T01:48:35.7970254Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2022-11-23T01:48:35.7970869Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2022-11-23T01:48:35.7971471Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2022-11-23T01:48:35.7972063Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2022-11-23T01:48:35.7972664Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2022-11-23T01:48:35.7973284Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2022-11-23T01:48:35.7973896Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2022-11-23T01:48:35.7974670Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2022-11-23T01:48:35.7994693Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2022-11-23T01:48:35.8078328Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2022-11-23T01:48:35.8079015Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:48:35.8079619Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:48:35.8080199Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 
2022-11-23T01:48:35.8080747Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2022-11-23T01:48:35.8081512Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2022-11-23T01:48:35.8082055Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2022-11-23T01:48:35.8084767Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2022-11-23T01:48:35.8086413Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2022-11-23T01:48:35.8086934Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2022-11-23T01:48:35.8181496Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2022-11-23T01:48:35.8246982Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2022-11-23T01:48:35.8247494Z creating: build/custom_test_artifacts/jit-hook-build/ 2022-11-23T01:48:35.8247969Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2022-11-23T01:48:35.8255133Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:48:35.8255712Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/ 2022-11-23T01:48:35.8256285Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:48:35.8256837Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:48:35.8257401Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:48:35.8258006Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:48:35.8259175Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:48:35.8259724Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:48:35.8260298Z creating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:48:35.8262657Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:48:35.8264159Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:48:35.8265604Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:48:35.8266328Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:48:35.8267991Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:48:35.8268841Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:48:35.8269421Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:48:35.8270131Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:48:35.8325348Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:48:35.8326312Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:48:35.8327045Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:48:35.8327772Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:48:35.8328496Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:48:35.8329185Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 
2022-11-23T01:48:35.8329884Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:48:35.8330547Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:48:35.8331328Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:48:35.8372607Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:48:35.8414529Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:48:35.8415260Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:48:35.8415944Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:48:35.8416587Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:48:35.8417225Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:48:35.8417880Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:48:35.8418842Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:48:35.8420075Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:48:35.8495109Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:48:35.8569834Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:48:35.8570514Z 
inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:48:35.8571053Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:48:35.8571585Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeError.log 2022-11-23T01:48:35.8572148Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2022-11-23T01:48:35.8572691Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2022-11-23T01:48:35.8573369Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2022-11-23T01:48:35.8573995Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2022-11-23T01:48:35.8574600Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2022-11-23T01:48:35.8575187Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2022-11-23T01:48:35.8575752Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2022-11-23T01:48:35.8576345Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2022-11-23T01:48:35.8577161Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2022-11-23T01:48:35.8577751Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2022-11-23T01:48:35.8578323Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2022-11-23T01:48:35.8599365Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2022-11-23T01:48:35.8663562Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2022-11-23T01:48:35.8664661Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:48:35.8665263Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:48:35.8665822Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2022-11-23T01:48:35.8666542Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2022-11-23T01:48:35.8667097Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2022-11-23T01:48:35.8667611Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2022-11-23T01:48:35.8669401Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2022-11-23T01:48:35.8670172Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2022-11-23T01:48:35.8670820Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2022-11-23T01:48:35.8720894Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2022-11-23T01:48:35.8721417Z creating: build/custom_test_artifacts/custom-backend-build/ 2022-11-23T01:48:35.8721918Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2022-11-23T01:48:35.8728510Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:48:35.8729147Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/ 2022-11-23T01:48:35.8729731Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:48:35.8730322Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:48:35.8730889Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:48:35.8732698Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:48:35.8733929Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:48:35.8734524Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:48:35.8735093Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:48:35.8737299Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:48:35.8738515Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:48:35.8740185Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:48:35.8740814Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:48:35.8742136Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:48:35.8743144Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:48:35.8743758Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:48:35.8744850Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:48:35.8799450Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:48:35.8800236Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:48:35.8800992Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:48:35.8801762Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:48:35.8802497Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:48:35.8803218Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-11-23T01:48:35.8804105Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:48:35.8804853Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:48:35.8805557Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:48:35.8846872Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:48:35.8888141Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:48:35.8888915Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:48:35.8889612Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:48:35.8890264Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:48:35.8890930Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:48:35.8891603Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:48:35.8892470Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:48:35.8894220Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:48:35.8968337Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:48:35.9044265Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:48:35.9044992Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:48:35.9045560Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:48:35.9046146Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeError.log 2022-11-23T01:48:35.9046717Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2022-11-23T01:48:35.9047297Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2022-11-23T01:48:35.9047902Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2022-11-23T01:48:35.9048566Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2022-11-23T01:48:35.9049397Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2022-11-23T01:48:35.9050023Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2022-11-23T01:48:35.9050638Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2022-11-23T01:48:35.9051277Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2022-11-23T01:48:35.9051910Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2022-11-23T01:48:35.9052548Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2022-11-23T01:48:35.9053158Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2022-11-23T01:48:35.9056533Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2022-11-23T01:48:35.9205437Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2022-11-23T01:48:35.9206141Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2022-11-23T01:48:35.9206797Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2022-11-23T01:48:35.9207448Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2022-11-23T01:48:35.9208112Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2022-11-23T01:48:35.9208748Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2022-11-23T01:48:35.9209397Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2022-11-23T01:48:35.9210048Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2022-11-23T01:48:35.9210695Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2022-11-23T01:48:35.9211334Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2022-11-23T01:48:35.9211978Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2022-11-23T01:48:35.9232129Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2022-11-23T01:48:35.9290860Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2022-11-23T01:48:35.9291585Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:48:35.9292247Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:48:35.9292822Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2022-11-23T01:48:35.9293390Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2022-11-23T01:48:35.9293960Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2022-11-23T01:48:35.9294515Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2022-11-23T01:48:35.9296720Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2022-11-23T01:48:35.9297635Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2022-11-23T01:48:35.9298277Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2022-11-23T01:48:35.9422813Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2022-11-23T01:48:35.9468915Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2022-11-23T01:48:35.9469435Z creating: build/lib/ 2022-11-23T01:48:35.9469726Z inflating: build/lib/libclog.a 2022-11-23T01:48:35.9537979Z inflating: build/lib/libgtest.a 2022-11-23T01:48:35.9548584Z inflating: build/lib/libpthreadpool.a 2022-11-23T01:48:35.9557989Z inflating: build/lib/libittnotify.a 2022-11-23T01:48:35.9667166Z inflating: 
build/lib/libprotobuf-lite.a 2022-11-23T01:48:35.9760997Z inflating: build/lib/libbenchmark.a 2022-11-23T01:48:35.9793910Z inflating: build/lib/libtensorpipe_uv.a 2022-11-23T01:48:35.9928895Z inflating: build/lib/libgloo.a 2022-11-23T01:48:36.0474921Z inflating: build/lib/libprotobuf.a 2022-11-23T01:48:36.0554459Z inflating: build/lib/libasmjit.a 2022-11-23T01:48:36.0588284Z inflating: build/lib/libfmt.a 2022-11-23T01:48:36.0588968Z inflating: build/lib/libfoxi_loader.a 2022-11-23T01:48:36.0589733Z inflating: build/lib/libcaffe2_nvrtc.so 2022-11-23T01:48:36.0672705Z inflating: build/lib/libc10.so 2022-11-23T01:48:36.0673251Z inflating: build/lib/libtorch_global_deps.so 2022-11-23T01:48:36.0684037Z inflating: build/lib/libcpuinfo.a 2022-11-23T01:48:36.0692949Z inflating: build/lib/libcpuinfo_internals.a 2022-11-23T01:48:36.0694603Z inflating: build/lib/libnnpack_reference_layers.a 2022-11-23T01:48:36.1285142Z inflating: build/lib/libprotoc.a 2022-11-23T01:48:36.1304264Z inflating: build/lib/libgmock.a 2022-11-23T01:48:36.1305297Z inflating: build/lib/libgtest_main.a 2022-11-23T01:48:36.1305980Z inflating: build/lib/libbenchmark_main.a 2022-11-23T01:48:36.1450343Z inflating: build/lib/libXNNPACK.a 2022-11-23T01:48:37.1531360Z inflating: build/lib/libdnnl.a 2022-11-23T01:48:37.2204981Z inflating: build/lib/libtensorpipe.a 2022-11-23T01:48:37.2259973Z inflating: build/lib/libc10_cuda.so 2022-11-23T01:48:37.2276780Z inflating: build/lib/libqnnpack.a 2022-11-23T01:48:37.3861101Z inflating: build/lib/libfbgemm.a 2022-11-23T01:48:37.3861723Z inflating: build/lib/libgmock_main.a 2022-11-23T01:48:37.3885554Z inflating: build/lib/libpytorch_qnnpack.a 2022-11-23T01:48:37.5063879Z inflating: build/lib/libdnnl_graph.a 2022-11-23T01:48:37.5362967Z inflating: build/lib/libtensorpipe_cuda.a 2022-11-23T01:48:37.5891601Z inflating: build/lib/libkineto.a 2022-11-23T01:48:37.5938201Z inflating: build/lib/libcaffe2_protos.a 2022-11-23T01:48:37.5987488Z inflating: 
build/lib/libonnx_proto.a 2022-11-23T01:48:37.6009393Z inflating: build/lib/libnnpack.a 2022-11-23T01:48:37.6707395Z inflating: build/lib/libonnx.a 2022-11-23T01:48:37.7153469Z inflating: build/lib/libgloo_cuda.a 2022-11-23T01:48:40.1516531Z inflating: build/lib/libtorch_cpu.so 2022-11-23T01:48:42.3445222Z inflating: build/lib/libtorch_cuda.so 2022-11-23T01:48:42.3445677Z inflating: build/lib/libtorch.so 2022-11-23T01:48:42.3448027Z inflating: build/lib/libc10d_cuda_test.so 2022-11-23T01:48:43.3602153Z inflating: build/lib/libtorch_cuda_linalg.so 2022-11-23T01:48:43.3626222Z inflating: build/lib/libjitbackend_test.so 2022-11-23T01:48:43.3689504Z inflating: build/lib/libtorchbind_test.so 2022-11-23T01:48:43.3720766Z inflating: build/lib/libbackend_with_compiler.so 2022-11-23T01:48:43.3724824Z inflating: build/lib/libshm.so 2022-11-23T01:48:43.5568379Z inflating: build/lib/libtorch_python.so 2022-11-23T01:48:43.5609443Z inflating: build/lib/libnnapi_backend.so 2022-11-23T01:48:43.5610053Z creating: build/bin/ 2022-11-23T01:48:43.5663335Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2022-11-23T01:48:43.5719582Z inflating: build/bin/c10_DeviceGuard_test 2022-11-23T01:48:43.5775068Z inflating: build/bin/c10_Device_test 2022-11-23T01:48:43.5839556Z inflating: build/bin/c10_DispatchKeySet_test 2022-11-23T01:48:43.5892961Z inflating: build/bin/c10_StreamGuard_test 2022-11-23T01:48:43.5945329Z inflating: build/bin/c10_SymInt_test 2022-11-23T01:48:43.6006241Z inflating: build/bin/c10_InlineDeviceGuard_test 2022-11-23T01:48:43.6066572Z inflating: build/bin/c10_InlineStreamGuard_test 2022-11-23T01:48:43.6128462Z inflating: build/bin/c10_SizesAndStrides_test 2022-11-23T01:48:43.6180487Z inflating: build/bin/c10_Array_test 2022-11-23T01:48:43.6238508Z inflating: build/bin/c10_Bitset_test 2022-11-23T01:48:43.6294258Z inflating: build/bin/c10_C++17_test 2022-11-23T01:48:43.6345466Z inflating: build/bin/c10_ConstexprCrc_test 2022-11-23T01:48:43.6399010Z inflating: 
build/bin/c10_DeadlockDetection_test 2022-11-23T01:48:43.6453501Z inflating: build/bin/c10_Half_test 2022-11-23T01:48:43.6515254Z inflating: build/bin/c10_LeftRight_test 2022-11-23T01:48:43.6583183Z inflating: build/bin/c10_Metaprogramming_test 2022-11-23T01:48:43.6741659Z inflating: build/bin/c10_SmallVectorTest 2022-11-23T01:48:43.6798126Z inflating: build/bin/c10_Synchronized_test 2022-11-23T01:48:43.6859873Z inflating: build/bin/c10_ThreadLocal_test 2022-11-23T01:48:43.6917190Z inflating: build/bin/c10_TypeIndex_test 2022-11-23T01:48:43.6971747Z inflating: build/bin/c10_TypeList_test 2022-11-23T01:48:43.7023932Z inflating: build/bin/c10_TypeTraits_test 2022-11-23T01:48:43.7080455Z inflating: build/bin/c10_accumulate_test 2022-11-23T01:48:43.7140951Z inflating: build/bin/c10_bfloat16_test 2022-11-23T01:48:43.7200599Z inflating: build/bin/c10_complex_math_test 2022-11-23T01:48:43.7260377Z inflating: build/bin/c10_complex_test 2022-11-23T01:48:43.7380403Z inflating: build/bin/c10_either_test 2022-11-23T01:48:43.7437843Z inflating: build/bin/c10_exception_test 2022-11-23T01:48:43.7493235Z inflating: build/bin/c10_flags_test 2022-11-23T01:48:43.7680810Z inflating: build/bin/c10_intrusive_ptr_test 2022-11-23T01:48:43.7735806Z inflating: build/bin/c10_irange_test 2022-11-23T01:48:43.7799166Z inflating: build/bin/c10_logging_test 2022-11-23T01:48:43.7880049Z inflating: build/bin/c10_optional_test 2022-11-23T01:48:43.7940214Z inflating: build/bin/c10_registry_test 2022-11-23T01:48:43.8008024Z inflating: build/bin/c10_ordered_preserving_dict_test 2022-11-23T01:48:43.8064542Z inflating: build/bin/c10_tempfile_test 2022-11-23T01:48:43.8128475Z inflating: build/bin/c10_string_view_test 2022-11-23T01:48:43.8189803Z inflating: build/bin/c10_typeid_test 2022-11-23T01:48:43.8251088Z inflating: build/bin/c10_intrusive_ptr_benchmark 2022-11-23T01:48:43.8785526Z inflating: build/bin/protoc-3.13.0.0 2022-11-23T01:48:43.9321312Z inflating: build/bin/protoc 
2022-11-23T01:48:43.9373785Z inflating: build/bin/c10_cuda_CUDATest 2022-11-23T01:48:43.9738541Z inflating: build/bin/vec_test_all_types_AVX2 2022-11-23T01:48:44.0061996Z inflating: build/bin/vec_test_all_types_DEFAULT 2022-11-23T01:48:44.0121678Z inflating: build/bin/FileStoreTest 2022-11-23T01:48:44.0180510Z inflating: build/bin/HashStoreTest 2022-11-23T01:48:44.0246072Z inflating: build/bin/TCPStoreTest 2022-11-23T01:48:44.0261944Z inflating: build/bin/ProcessGroupMPITest 2022-11-23T01:48:44.0264157Z inflating: build/bin/example_allreduce 2022-11-23T01:48:44.0322646Z inflating: build/bin/Dimname_test 2022-11-23T01:48:44.0403814Z inflating: build/bin/Dict_test 2022-11-23T01:48:44.0474571Z inflating: build/bin/MaybeOwned_test 2022-11-23T01:48:44.0536572Z inflating: build/bin/NamedTensor_test 2022-11-23T01:48:44.0600601Z inflating: build/bin/atest 2022-11-23T01:48:44.0665056Z inflating: build/bin/apply_utils_test 2022-11-23T01:48:44.0733220Z inflating: build/bin/basic 2022-11-23T01:48:44.0792573Z inflating: build/bin/broadcast_test 2022-11-23T01:48:44.0856199Z inflating: build/bin/cpu_generator_test 2022-11-23T01:48:44.0914526Z inflating: build/bin/cpu_profiling_allocator_test 2022-11-23T01:48:44.0968870Z inflating: build/bin/dispatch_key_set_test 2022-11-23T01:48:44.1066006Z inflating: build/bin/cpu_rng_test 2022-11-23T01:48:44.1121038Z inflating: build/bin/dlconvertor_test 2022-11-23T01:48:44.1184107Z inflating: build/bin/extension_backend_test 2022-11-23T01:48:44.1245520Z inflating: build/bin/half_test 2022-11-23T01:48:44.1349960Z inflating: build/bin/ivalue_test 2022-11-23T01:48:44.1404975Z inflating: build/bin/lazy_tensor_test 2022-11-23T01:48:44.1463379Z inflating: build/bin/memory_format_test 2022-11-23T01:48:44.1522612Z inflating: build/bin/math_kernel_test 2022-11-23T01:48:44.1581059Z inflating: build/bin/memory_overlapping_test 2022-11-23T01:48:44.1637189Z inflating: build/bin/operator_name_test 2022-11-23T01:48:44.1694589Z inflating: 
build/bin/mobile_memory_cleanup 2022-11-23T01:48:44.1756111Z inflating: build/bin/native_test 2022-11-23T01:48:44.1811607Z inflating: build/bin/operators_test 2022-11-23T01:48:44.1868855Z inflating: build/bin/packedtensoraccessor_test 2022-11-23T01:48:44.1941065Z inflating: build/bin/pow_test 2022-11-23T01:48:44.2004231Z inflating: build/bin/quantized_test 2022-11-23T01:48:44.2058719Z inflating: build/bin/reportMemoryUsage_test 2022-11-23T01:48:44.2112638Z inflating: build/bin/reduce_ops_test 2022-11-23T01:48:44.2174301Z inflating: build/bin/scalar_tensor_test 2022-11-23T01:48:44.2237491Z inflating: build/bin/scalar_test 2022-11-23T01:48:44.2294300Z inflating: build/bin/stride_properties_test 2022-11-23T01:48:44.2381337Z inflating: build/bin/tensor_iterator_test 2022-11-23T01:48:44.2443301Z inflating: build/bin/type_ptr_test 2022-11-23T01:48:44.2445187Z inflating: build/bin/thread_init_test 2022-11-23T01:48:44.2507247Z inflating: build/bin/test_parallel 2022-11-23T01:48:44.2560925Z inflating: build/bin/variant_test 2022-11-23T01:48:44.2628515Z inflating: build/bin/type_test 2022-11-23T01:48:44.2685221Z inflating: build/bin/undefined_tensor_test 2022-11-23T01:48:44.2685974Z inflating: build/bin/verify_api_visibility 2022-11-23T01:48:44.2763199Z inflating: build/bin/vmap_test 2022-11-23T01:48:44.2819637Z inflating: build/bin/weakref_test 2022-11-23T01:48:44.2885617Z inflating: build/bin/IListRef_test 2022-11-23T01:48:44.2939317Z inflating: build/bin/xla_tensor_test 2022-11-23T01:48:44.2996009Z inflating: build/bin/wrapdim_test 2022-11-23T01:48:44.3117906Z inflating: build/bin/List_test 2022-11-23T01:48:44.3251845Z inflating: build/bin/kernel_function_legacy_test 2022-11-23T01:48:44.3359635Z inflating: build/bin/kernel_function_test 2022-11-23T01:48:44.3431580Z inflating: build/bin/KernelFunction_test 2022-11-23T01:48:44.3572791Z inflating: build/bin/kernel_lambda_legacy_test 2022-11-23T01:48:44.3687673Z inflating: build/bin/kernel_lambda_test 
2022-11-23T01:48:44.3752952Z inflating: build/bin/kernel_stackbased_test 2022-11-23T01:48:44.3808660Z inflating: build/bin/CppSignature_test 2022-11-23T01:48:44.3914312Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2022-11-23T01:48:44.3966654Z inflating: build/bin/op_allowlist_test 2022-11-23T01:48:44.4288426Z inflating: build/bin/op_registration_test 2022-11-23T01:48:44.4346140Z inflating: build/bin/inline_container_test 2022-11-23T01:48:44.4408301Z inflating: build/bin/backend_fallback_test 2022-11-23T01:48:44.4464636Z inflating: build/bin/cuda_apply_test 2022-11-23T01:48:44.4523882Z inflating: build/bin/cuda_caching_host_allocator_test 2022-11-23T01:48:44.4589039Z inflating: build/bin/cuda_atomic_ops_test 2022-11-23T01:48:44.4662624Z inflating: build/bin/cuda_complex_math_test 2022-11-23T01:48:44.4717652Z inflating: build/bin/cuda_device_test 2022-11-23T01:48:44.4782028Z inflating: build/bin/cuda_complex_test 2022-11-23T01:48:44.4847645Z inflating: build/bin/cuda_cub_test 2022-11-23T01:48:44.4902324Z inflating: build/bin/cuda_dlconvertor_test 2022-11-23T01:48:44.4957122Z inflating: build/bin/cuda_integer_divider_test 2022-11-23T01:48:44.5031228Z inflating: build/bin/cuda_distributions_test 2022-11-23T01:48:44.5095047Z inflating: build/bin/cuda_generator_test 2022-11-23T01:48:44.5150081Z inflating: build/bin/cuda_half_test 2022-11-23T01:48:44.5207649Z inflating: build/bin/cuda_reportMemoryUsage_test 2022-11-23T01:48:44.5273795Z inflating: build/bin/cuda_stream_test 2022-11-23T01:48:44.5328056Z inflating: build/bin/cuda_optional_test 2022-11-23T01:48:44.5380703Z inflating: build/bin/cuda_cudnn_test 2022-11-23T01:48:44.5437098Z inflating: build/bin/cuda_packedtensoraccessor_test 2022-11-23T01:48:44.5492770Z inflating: build/bin/cuda_vectorized_test 2022-11-23T01:48:44.5511612Z inflating: build/bin/tutorial_tensorexpr 2022-11-23T01:48:44.5582297Z inflating: build/bin/ProcessGroupGlooTest 2022-11-23T01:48:44.5647034Z inflating: 
build/bin/ProcessGroupGlooAsyncTest 2022-11-23T01:48:44.5709901Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2022-11-23T01:48:44.5776179Z inflating: build/bin/ProcessGroupNCCLTest 2022-11-23T01:48:44.5834307Z inflating: build/bin/ProcessGroupUCCTest 2022-11-23T01:48:44.5892790Z inflating: build/bin/test_dist_autograd 2022-11-23T01:48:44.5968518Z inflating: build/bin/test_cpp_rpc 2022-11-23T01:48:44.5970133Z inflating: build/bin/parallel_benchmark 2022-11-23T01:48:44.6048382Z inflating: build/bin/test_mobile_nnc 2022-11-23T01:48:44.6059700Z inflating: build/bin/aot_model_compiler_test 2022-11-23T01:48:44.6993680Z inflating: build/bin/test_tensorexpr 2022-11-23T01:48:44.7388844Z inflating: build/bin/test_lazy 2022-11-23T01:48:44.7394833Z inflating: build/bin/torch_shm_manager 2022-11-23T01:48:44.8747957Z inflating: build/bin/test_api 2022-11-23T01:48:44.9989872Z inflating: build/bin/test_jit 2022-11-23T01:48:44.9990433Z inflating: .pytorch-test-times.json 2022-11-23T01:48:45.0021341Z ##[group]Run df -H 2022-11-23T01:48:45.0021621Z df -H 2022-11-23T01:48:45.0037030Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:48:45.0037334Z env: 2022-11-23T01:48:45.0037572Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:48:45.0037827Z GPU_FLAG: --gpus all 2022-11-23T01:48:45.0038068Z ##[endgroup] 2022-11-23T01:48:45.0081168Z Filesystem Size Used Avail Use% Mounted on 2022-11-23T01:48:45.0081516Z devtmpfs 258G 0 258G 0% /dev 2022-11-23T01:48:45.0081813Z tmpfs 258G 0 258G 0% /dev/shm 2022-11-23T01:48:45.0082091Z tmpfs 258G 746k 258G 1% /run 2022-11-23T01:48:45.0082389Z tmpfs 258G 0 258G 0% /sys/fs/cgroup 2022-11-23T01:48:45.0082664Z /dev/xvda1 162G 30G 132G 19% / 2022-11-23T01:48:45.0083127Z tmpfs 52G 0 52G 0% /run/user/0 2022-11-23T01:48:45.0107105Z ##[group]Run .github/scripts/parse_ref.py 2022-11-23T01:48:45.0107516Z .github/scripts/parse_ref.py 2022-11-23T01:48:45.0119428Z shell: /usr/bin/bash -e {0} 2022-11-23T01:48:45.0119675Z env: 
2022-11-23T01:48:45.0119894Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:48:45.0120165Z GPU_FLAG: --gpus all 2022-11-23T01:48:45.0120575Z ##[endgroup] 2022-11-23T01:48:45.0422341Z ##[group]Run set -x 2022-11-23T01:48:45.0422733Z set -x 2022-11-23T01:48:45.0422968Z  2022-11-23T01:48:45.0423388Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2022-11-23T01:48:45.0424277Z  TEST_COMMAND=.jenkins/pytorch/multigpu-test.sh 2022-11-23T01:48:45.0424643Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2022-11-23T01:48:45.0424952Z  TEST_COMMAND=.jenkins/caffe2/test.sh 2022-11-23T01:48:45.0425227Z else 2022-11-23T01:48:45.0425504Z  TEST_COMMAND=.jenkins/pytorch/test.sh 2022-11-23T01:48:45.0425777Z fi 2022-11-23T01:48:45.0425976Z  2022-11-23T01:48:45.0426420Z COMMIT_MESSAGES=$(git cherry -v "origin/${GIT_DEFAULT_BRANCH:-master}") 2022-11-23T01:48:45.0426738Z  2022-11-23T01:48:45.0427019Z # sanitize the input commit message and PR body here: 2022-11-23T01:48:45.0427609Z # 2022-11-23T01:48:45.0427984Z # trim all new lines from commit messages + PR_BODY to avoid issues with batch environment 2022-11-23T01:48:45.0428454Z # variable copying. 
see https://github.com/pytorch/pytorch/pull/80043#issuecomment-1167796028 2022-11-23T01:48:45.0428917Z COMMIT_MESSAGES="${COMMIT_MESSAGES//[$'\n\r']}" 2022-11-23T01:48:45.0429227Z PR_BODY="${PR_BODY//[$'\n\r']}" 2022-11-23T01:48:45.0429455Z  2022-11-23T01:48:45.0429799Z # then trim all special characters like single and double quotes to avoid unescaped inputs to 2022-11-23T01:48:45.0430162Z # wreak havoc internally 2022-11-23T01:48:45.0430523Z export COMMIT_MESSAGES="${COMMIT_MESSAGES//[\'\"]}" 2022-11-23T01:48:45.0430821Z export PR_BODY="${PR_BODY//[\'\"]}" 2022-11-23T01:48:45.0431076Z  2022-11-23T01:48:45.0431376Z # detached container should get cleaned up by teardown_ec2_linux 2022-11-23T01:48:45.0431758Z # TODO: Stop building test binaries as part of the build phase 2022-11-23T01:48:45.0432121Z # Used for GPU_FLAG since that doesn't play nice 2022-11-23T01:48:45.0432442Z # shellcheck disable=SC2086,SC2090 2022-11-23T01:48:45.0432737Z container_name=$(docker run \ 2022-11-23T01:48:45.0433170Z  ${GPU_FLAG:-} \ 2022-11-23T01:48:45.0433446Z  -e BUILD_ENVIRONMENT \ 2022-11-23T01:48:45.0433718Z  -e PR_NUMBER \ 2022-11-23T01:48:45.0433969Z  -e GITHUB_ACTIONS \ 2022-11-23T01:48:45.0434232Z  -e BASE_SHA \ 2022-11-23T01:48:45.0434483Z  -e BRANCH \ 2022-11-23T01:48:45.0434709Z  -e SHA1 \ 2022-11-23T01:48:45.0434967Z  -e AWS_DEFAULT_REGION \ 2022-11-23T01:48:45.0435234Z  -e IN_WHEEL_TEST \ 2022-11-23T01:48:45.0435478Z  -e SHARD_NUMBER \ 2022-11-23T01:48:45.0435742Z  -e TEST_CONFIG \ 2022-11-23T01:48:45.0436014Z  -e NUM_TEST_SHARDS \ 2022-11-23T01:48:45.0436415Z  -e PR_BODY \ 2022-11-23T01:48:45.0436672Z  -e COMMIT_MESSAGES \ 2022-11-23T01:48:45.0436954Z  -e PYTORCH_RETRY_TEST_CASES \ 2022-11-23T01:48:45.0437237Z  -e PYTORCH_OVERRIDE_FLAKY_SIGNAL \ 2022-11-23T01:48:45.0437515Z  -e PR_LABELS \ 2022-11-23T01:48:45.0437797Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2022-11-23T01:48:45.0438076Z  -e SCCACHE_BUCKET \ 2022-11-23T01:48:45.0438329Z  -e SCCACHE_S3_KEY_PREFIX \ 
2022-11-23T01:48:45.0438590Z  -e XLA_CUDA \ 2022-11-23T01:48:45.0438867Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2022-11-23T01:48:45.0439161Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2022-11-23T01:48:45.0439478Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2022-11-23T01:48:45.0439820Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2022-11-23T01:48:45.0440115Z  --ulimit stack=10485760:83886080 \ 2022-11-23T01:48:45.0440425Z  --security-opt seccomp=unconfined \ 2022-11-23T01:48:45.0440811Z  --cap-add=SYS_PTRACE \ 2022-11-23T01:48:45.0441086Z  --ipc=host \ 2022-11-23T01:48:45.0441328Z  --shm-size="${SHM_SIZE}" \ 2022-11-23T01:48:45.0441581Z  --tty \ 2022-11-23T01:48:45.0441815Z  --detach \ 2022-11-23T01:48:45.0442058Z  --name="${container_name}" \ 2022-11-23T01:48:45.0442327Z  --user jenkins \ 2022-11-23T01:48:45.0442641Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2022-11-23T01:48:45.0442953Z  -w /var/lib/jenkins/workspace \ 2022-11-23T01:48:45.0443225Z  "${DOCKER_IMAGE}" 2022-11-23T01:48:45.0443458Z ) 2022-11-23T01:48:45.0443731Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2022-11-23T01:48:45.0444160Z docker exec -t "${container_name}" sh -c "pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2022-11-23T01:48:45.0456796Z shell: /usr/bin/bash -e {0} 2022-11-23T01:48:45.0457036Z env: 2022-11-23T01:48:45.0457274Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:48:45.0457516Z GPU_FLAG: --gpus all 2022-11-23T01:48:45.0457834Z BUILD_ENVIRONMENT: linux-bionic-cuda11.6-py3.9-gcc7 2022-11-23T01:48:45.0458135Z PR_NUMBER: 2022-11-23T01:48:45.0458346Z BRANCH: master 2022-11-23T01:48:45.0458614Z SHA1: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:48:45.0458921Z BASE_SHA: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:48:45.0459193Z PYTORCH_RETRY_TEST_CASES: 1 2022-11-23T01:48:45.0459470Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-11-23T01:48:45.0459736Z TEST_CONFIG: multigpu 2022-11-23T01:48:45.0459978Z SHARD_NUMBER: 1 
2022-11-23T01:48:45.0460191Z NUM_TEST_SHARDS: 1 2022-11-23T01:48:45.0460421Z PR_BODY: 2022-11-23T01:48:45.0460719Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2022-11-23T01:48:45.0461022Z SCCACHE_S3_KEY_PREFIX: periodic 2022-11-23T01:48:45.0461279Z SHM_SIZE: 2g 2022-11-23T01:48:45.0461753Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:48:45.0462189Z XLA_CUDA: 2022-11-23T01:48:45.0462528Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2022-11-23T01:48:45.0462893Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 1 2022-11-23T01:48:45.0463167Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2022-11-23T01:48:45.0463428Z ##[endgroup] 2022-11-23T01:48:45.0496629Z + [[ multigpu == \m\u\l\t\i\g\p\u ]] 2022-11-23T01:48:45.0497132Z + TEST_COMMAND=.jenkins/pytorch/multigpu-test.sh 2022-11-23T01:48:45.0501451Z ++ git cherry -v origin/master 2022-11-23T01:48:45.0519701Z + COMMIT_MESSAGES= 2022-11-23T01:48:45.0520505Z + COMMIT_MESSAGES= 2022-11-23T01:48:45.0520814Z + PR_BODY= 2022-11-23T01:48:45.0521053Z + export COMMIT_MESSAGES= 2022-11-23T01:48:45.0521313Z + COMMIT_MESSAGES= 2022-11-23T01:48:45.0521572Z + export PR_BODY= 2022-11-23T01:48:45.0521816Z + PR_BODY= 2022-11-23T01:48:45.0533092Z +++ nproc --ignore=2 2022-11-23T01:48:45.0580289Z ++ docker run --gpus all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e PR_BODY -e COMMIT_MESSAGES -e PYTORCH_RETRY_TEST_CASES -e PYTORCH_OVERRIDE_FLAKY_SIGNAL -e PR_LABELS -e MAX_JOBS=62 -e SCCACHE_BUCKET -e SCCACHE_S3_KEY_PREFIX -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS --env-file=/tmp/github_env_3528394938 --ulimit stack=10485760:83886080 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g 
--tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:48:59.3279365Z + container_name=4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T01:48:59.3279956Z + echo DOCKER_CONTAINER_ID=4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T01:48:59.3285053Z ++ echo dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:48:59.3286633Z + docker exec -t 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 sh -c 'pip install dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl[opt-einsum] && .jenkins/pytorch/multigpu-test.sh' 2022-11-23T01:48:59.9346642Z Processing ./dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:49:00.9697553Z Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (2.6.3) 2022-11-23T01:49:00.9699854Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (1.11.1) 2022-11-23T01:49:00.9703753Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (4.4.0) 2022-11-23T01:49:00.9721742Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (3.3.0) 2022-11-23T01:49:00.9802444Z Requirement already satisfied: numpy>=1.7 in /opt/conda/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==1.14.0a0+git1cfd385) (1.21.2) 2022-11-23T01:49:01.0027714Z Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch==1.14.0a0+git1cfd385) (1.2.1) 2022-11-23T01:49:02.0146711Z Installing collected packages: torch 
2022-11-23T01:49:12.2718123Z Successfully installed torch-1.14.0a0+git1cfd385 2022-11-23T01:49:12.3496505Z ++ [[ linux-bionic-cuda11.6-py3.9-gcc7 == *rocm* ]] 2022-11-23T01:49:12.3496890Z ++ BUILD_TEST_LIBTORCH=0 2022-11-23T01:49:12.3497220Z + echo 'Testing pytorch' 2022-11-23T01:49:12.3497473Z Testing pytorch 2022-11-23T01:49:12.3497917Z + python test/run_test.py --verbose -i distributed/test_c10d_common 2022-11-23T01:49:14.7014496Z Ignoring disabled issues: [] 2022-11-23T01:49:14.7543010Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T01:49:14.7543630Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T01:49:14.7544517Z Selected tests: 2022-11-23T01:49:14.7544782Z distributed/test_c10d_common 2022-11-23T01:49:14.7572713Z Prioritized test from test file changes. 2022-11-23T01:49:14.7573054Z reordering tests for PR: 2022-11-23T01:49:14.7573330Z prioritized: [] 2022-11-23T01:49:14.7573799Z the rest: ['distributed/test_c10d_common'] 2022-11-23T01:49:14.7573998Z 2022-11-23T01:49:14.7578993Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T01:49:14.7814857Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T01:49:14.8299483Z parallel (file granularity) tests: 2022-11-23T01:49:14.8299900Z 2022-11-23T01:49:14.8300175Z serial (file granularity) tests: 2022-11-23T01:49:14.8300454Z distributed/test_c10d_common 2022-11-23T01:49:17.0937836Z Ignoring disabled issues: [] 2022-11-23T01:49:17.5061237Z Running distributed/test_c10d_common ... 
[2022-11-23 01:49:17.505477] 2022-11-23T01:49:17.5064479Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_common.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 01:49:17.505989] 2022-11-23T01:50:27.3218043Z 2022-11-23T01:50:27.3218904Z Expand the folded group to see the log file of distributed/test_c10d_common 2022-11-23T01:50:27.3230060Z ##[group]PRINTING LOG FILE of distributed/test_c10d_common (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_common_3262w7td) 2022-11-23T01:50:27.3231722Z <unittest.suite.TestSuite tests=[<__main__.CommTest testMethod=test_debug_level>]> 2022-11-23T01:50:27.3232473Z test_debug_level (__main__.CommTest) 2022-11-23T01:50:27.3234175Z <unittest.suite.TestSuite tests=[<__main__.ComputeBucketAssignmentTest testMethod=test_multi_limit_multi_dtype>, <__main__.ComputeBucketAssignmentTest testMethod=test_multi_limit_single_dtype>, <__main__.ComputeBucketAssignmentTest testMethod=test_single_limit_multi_dtype>, <__main__.ComputeBucketAssignmentTest testMethod=test_single_limit_single_dtype>]> 2022-11-23T01:50:27.3236060Z test_multi_limit_multi_dtype (__main__.ComputeBucketAssignmentTest) 2022-11-23T01:50:27.3236901Z test_multi_limit_single_dtype (__main__.ComputeBucketAssignmentTest) 2022-11-23T01:50:27.3237806Z test_single_limit_multi_dtype (__main__.ComputeBucketAssignmentTest) 2022-11-23T01:50:27.3238696Z test_single_limit_single_dtype (__main__.ComputeBucketAssignmentTest) 2022-11-23T01:50:27.3240867Z <unittest.suite.TestSuite tests=[<__main__.PythonProcessGroupExtensionTest testMethod=test_backend_class_attr>, <__main__.PythonProcessGroupExtensionTest testMethod=test_collectives>, <__main__.PythonProcessGroupExtensionTest testMethod=test_get_backend_name>, <__main__.PythonProcessGroupExtensionTest testMethod=test_send_recv>]> 2022-11-23T01:50:27.3242732Z test_backend_class_attr (__main__.PythonProcessGroupExtensionTest) 2022-11-23T01:50:27.3243568Z test_collectives (__main__.PythonProcessGroupExtensionTest) 2022-11-23T01:50:27.3244485Z test_get_backend_name (__main__.PythonProcessGroupExtensionTest) 2022-11-23T01:50:27.3245326Z test_send_recv (__main__.PythonProcessGroupExtensionTest) 2022-11-23T01:50:27.3246656Z <unittest.suite.TestSuite tests=[<__main__.ReduceOpTest testMethod=test_op_isinstance_of_reduceop>, <__main__.ReduceOpTest
testMethod=test_reduceop_copyable>, <__main__.ReduceOpTest testMethod=test_reduceop_pickle>]> 2022-11-23T01:50:27.3248372Z test_op_isinstance_of_reduceop (__main__.ReduceOpTest) 2022-11-23T01:50:27.3248997Z test_reduceop_copyable (__main__.ReduceOpTest) 2022-11-23T01:50:27.3249784Z test_reduceop_pickle (__main__.ReduceOpTest) 2022-11-23T01:50:27.3251122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3252136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3253320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3254271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3254823Z 2022-11-23T01:50:27.3255086Z Running tests... 2022-11-23T01:50:27.3255856Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3256989Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3257938Z test_debug_level (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3258895Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 300 2022-11-23T01:50:27.3259781Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 301 2022-11-23T01:50:27.3261156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3262004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3263184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3264633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3265804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3266845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3268105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3269322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3270261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:50:27.3271285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:50:27.3272007Z ok (4.057s) 2022-11-23T01:50:27.3272306Z 2022-11-23T01:50:27.3272875Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3273524Z Ran 1 test in 4.057s 2022-11-23T01:50:27.3273887Z 2022-11-23T01:50:27.3274034Z OK 2022-11-23T01:50:27.3274314Z 2022-11-23T01:50:27.3274568Z Generating XML reports... 
2022-11-23T01:50:27.3275730Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-CommTest-20221123014921.xml 2022-11-23T01:50:27.3277130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3278244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3279498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3280519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3280987Z 2022-11-23T01:50:27.3281120Z Running tests... 2022-11-23T01:50:27.3281963Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3283095Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3284170Z test_multi_limit_multi_dtype (__main__.ComputeBucketAssignmentTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3284979Z ok (1.710s) 2022-11-23T01:50:27.3285280Z 2022-11-23T01:50:27.3285832Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3286483Z Ran 1 test in 1.711s 2022-11-23T01:50:27.3286832Z 2022-11-23T01:50:27.3287086Z OK 2022-11-23T01:50:27.3287279Z 2022-11-23T01:50:27.3287532Z Generating XML reports... 
2022-11-23T01:50:27.3288856Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123014927.xml 2022-11-23T01:50:27.3290398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3291344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3292566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3293546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3294001Z 2022-11-23T01:50:27.3294301Z Running tests... 2022-11-23T01:50:27.3295047Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3296184Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3297366Z test_multi_limit_single_dtype (__main__.ComputeBucketAssignmentTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3298314Z ok (1.706s) 2022-11-23T01:50:27.3298610Z 2022-11-23T01:50:27.3299172Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3299803Z Ran 1 test in 1.706s 2022-11-23T01:50:27.3300122Z 2022-11-23T01:50:27.3300353Z OK 2022-11-23T01:50:27.3300577Z 2022-11-23T01:50:27.3300835Z Generating XML reports... 
2022-11-23T01:50:27.3302257Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123014932.xml 2022-11-23T01:50:27.3304222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3305139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3306574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3307768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3308309Z 2022-11-23T01:50:27.3308469Z Running tests... 2022-11-23T01:50:27.3309330Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3310546Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3311746Z test_single_limit_multi_dtype (__main__.ComputeBucketAssignmentTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3312539Z ok (1.707s) 2022-11-23T01:50:27.3312829Z 2022-11-23T01:50:27.3313389Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3314027Z Ran 1 test in 1.707s 2022-11-23T01:50:27.3314343Z 2022-11-23T01:50:27.3314522Z OK 2022-11-23T01:50:27.3314931Z 2022-11-23T01:50:27.3315170Z Generating XML reports... 
2022-11-23T01:50:27.3316612Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123014936.xml 2022-11-23T01:50:27.3318280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3319051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3320067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3321065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3321527Z 2022-11-23T01:50:27.3321619Z Running tests... 2022-11-23T01:50:27.3322306Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3323225Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3324049Z test_single_limit_single_dtype (__main__.ComputeBucketAssignmentTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3324693Z ok (1.716s) 2022-11-23T01:50:27.3324940Z 2022-11-23T01:50:27.3325372Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3326055Z Ran 1 test in 1.716s 2022-11-23T01:50:27.3326326Z 2022-11-23T01:50:27.3326473Z OK 2022-11-23T01:50:27.3326691Z 2022-11-23T01:50:27.3326892Z Generating XML reports... 
2022-11-23T01:50:27.3327946Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123014940.xml 2022-11-23T01:50:27.3329356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3330132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3331109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3331909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3332262Z 2022-11-23T01:50:27.3332449Z Running tests... 2022-11-23T01:50:27.3333109Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3333981Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3334825Z test_backend_class_attr (__main__.PythonProcessGroupExtensionTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3335648Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 763 2022-11-23T01:50:27.3336474Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 764 2022-11-23T01:50:27.3337108Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 765 2022-11-23T01:50:27.3337822Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 766 2022-11-23T01:50:27.3339033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3339917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3341055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3341790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3342779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3343592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3345177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3345946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3347012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3348015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3349049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3350045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3351044Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3351769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3352769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3353584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3354358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T01:50:27.3355287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:50:27.3356092Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:50:27.3356878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T01:50:27.3357442Z ok (4.125s) 2022-11-23T01:50:27.3357696Z 2022-11-23T01:50:27.3358203Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3358955Z Ran 1 test in 4.126s 2022-11-23T01:50:27.3359229Z 2022-11-23T01:50:27.3359385Z OK 2022-11-23T01:50:27.3359586Z 2022-11-23T01:50:27.3359798Z Generating XML reports... 
2022-11-23T01:50:27.3360967Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123014944.xml 2022-11-23T01:50:27.3362594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3363356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3364380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3365338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3365719Z 2022-11-23T01:50:27.3365898Z Running tests... 2022-11-23T01:50:27.3366826Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3367782Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3368731Z test_collectives (__main__.PythonProcessGroupExtensionTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3369557Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1094 2022-11-23T01:50:27.3370318Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1095 2022-11-23T01:50:27.3371119Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1096 2022-11-23T01:50:27.3372158Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1097 2022-11-23T01:50:27.3373223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3373979Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3375014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3375682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3376858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3377619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3378633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3379616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3380578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3381542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3382548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3383337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3384978Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3385808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3386875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3387941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3388667Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:50:27.3389789Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T01:50:27.3390921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T01:50:27.3392065Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T01:50:27.3392950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:50:27.3393733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T01:50:27.3394589Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T01:50:27.3395446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:50:27.3396298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T01:50:27.3397259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:50:27.3398381Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T01:50:27.3399721Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T01:50:27.3401033Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T01:50:27.3402367Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T01:50:27.3403011Z ok (7.083s) 2022-11-23T01:50:27.3403241Z 2022-11-23T01:50:27.3403705Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3404348Z Ran 1 test in 7.083s 2022-11-23T01:50:27.3404630Z 2022-11-23T01:50:27.3404770Z OK 2022-11-23T01:50:27.3404973Z 2022-11-23T01:50:27.3405171Z Generating XML reports... 2022-11-23T01:50:27.3406227Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123014950.xml 2022-11-23T01:50:27.3407428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3408168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3409121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3409878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3410233Z 2022-11-23T01:50:27.3410497Z Running tests... 2022-11-23T01:50:27.3411212Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3412194Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3412970Z test_get_backend_name (__main__.PythonProcessGroupExtensionTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3413801Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1434 2022-11-23T01:50:27.3414563Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1435 2022-11-23T01:50:27.3415325Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1436 2022-11-23T01:50:27.3416037Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1437 2022-11-23T01:50:27.3417111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3418042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3419055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3419884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3420901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3421781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3422721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3423469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3425021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3425730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3426702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3427440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3428417Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3429392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3430475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3431182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3431921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:50:27.3432681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T01:50:27.3433450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:50:27.3434370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T01:50:27.3434932Z ok (4.154s) 2022-11-23T01:50:27.3435169Z 2022-11-23T01:50:27.3435625Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3436174Z Ran 1 test in 4.154s 2022-11-23T01:50:27.3436439Z 2022-11-23T01:50:27.3436582Z OK 2022-11-23T01:50:27.3436855Z 2022-11-23T01:50:27.3436981Z Generating XML reports... 
2022-11-23T01:50:27.3438231Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123015000.xml 2022-11-23T01:50:27.3439543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3440290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3441449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3442407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3442762Z 2022-11-23T01:50:27.3442918Z Running tests... 2022-11-23T01:50:27.3443546Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3444435Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3445288Z test_send_recv (__main__.PythonProcessGroupExtensionTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3446125Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1765 2022-11-23T01:50:27.3446912Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1766 2022-11-23T01:50:27.3447607Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1767 2022-11-23T01:50:27.3448327Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1768 2022-11-23T01:50:27.3449387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3450272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3451274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3452028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3453010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3453744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3454747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3455544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3456533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3457489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3458379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3583219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3584854Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3585609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3586539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3587331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3588071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T01:50:27.3589493Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T01:50:27.3590407Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:50:27.3591475Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T01:50:27.3592291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T01:50:27.3593420Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T01:50:27.3594343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:50:27.3595188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T01:50:27.3596058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:50:27.3597123Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T01:50:27.3598356Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T01:50:27.3599395Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:50:27.3600483Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T01:50:27.3601785Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T01:50:27.3603063Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T01:50:27.3603733Z ok (6.967s) 2022-11-23T01:50:27.3603955Z 2022-11-23T01:50:27.3604418Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3604913Z Ran 1 test in 6.967s 2022-11-23T01:50:27.3605156Z 2022-11-23T01:50:27.3605284Z OK 2022-11-23T01:50:27.3605531Z 2022-11-23T01:50:27.3605715Z Generating XML reports... 2022-11-23T01:50:27.3606903Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123015007.xml 2022-11-23T01:50:27.3608183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3608912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3609891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3610701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3611063Z 2022-11-23T01:50:27.3611270Z Running tests... 2022-11-23T01:50:27.3612084Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3613001Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3613791Z test_op_isinstance_of_reduceop (__main__.ReduceOpTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3614363Z ok (1.730s) 2022-11-23T01:50:27.3614593Z 2022-11-23T01:50:27.3615208Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3615682Z Ran 1 test in 1.730s 2022-11-23T01:50:27.3615926Z 2022-11-23T01:50:27.3616054Z OK 2022-11-23T01:50:27.3616268Z 2022-11-23T01:50:27.3616458Z Generating XML reports... 2022-11-23T01:50:27.3617370Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123015016.xml 2022-11-23T01:50:27.3618750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3619545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3620581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3621394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3621972Z 2022-11-23T01:50:27.3622084Z Running tests... 2022-11-23T01:50:27.3622750Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3623644Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3624955Z test_reduceop_copyable (__main__.ReduceOpTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3625573Z ok (1.725s) 2022-11-23T01:50:27.3625818Z 2022-11-23T01:50:27.3626285Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3626811Z Ran 1 test in 1.725s 2022-11-23T01:50:27.3627067Z 2022-11-23T01:50:27.3627204Z OK 2022-11-23T01:50:27.3627572Z 2022-11-23T01:50:27.3627760Z Generating XML reports... 
2022-11-23T01:50:27.3628718Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123015020.xml 2022-11-23T01:50:27.3629978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:50:27.3630697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:50:27.3631692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:50:27.3632398Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:50:27.3632730Z 2022-11-23T01:50:27.3632894Z Running tests... 2022-11-23T01:50:27.3633567Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3634494Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T01:50:27.3635274Z test_reduceop_pickle (__main__.ReduceOpTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:50:27.3635825Z ok (1.723s) 2022-11-23T01:50:27.3636064Z 2022-11-23T01:50:27.3636521Z ---------------------------------------------------------------------- 2022-11-23T01:50:27.3637015Z Ran 1 test in 1.723s 2022-11-23T01:50:27.3637282Z 2022-11-23T01:50:27.3637428Z OK 2022-11-23T01:50:27.3637636Z 2022-11-23T01:50:27.3637829Z Generating XML reports... 
2022-11-23T01:50:27.3638865Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123015024.xml 2022-11-23T01:50:27.3639653Z 2022-11-23T01:50:27.3640420Z ##[endgroup] 2022-11-23T01:50:27.3641371Z FINISHED PRINTING LOG FILE of distributed/test_c10d_common (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_common_3262w7td) 2022-11-23T01:50:27.3641953Z 2022-11-23T01:50:27.6816824Z 2022-11-23T01:50:27.6817350Z real 1m15.332s 2022-11-23T01:50:27.6817655Z user 2m18.187s 2022-11-23T01:50:27.6817931Z sys 1m55.476s 2022-11-23T01:50:27.6818526Z + python test/run_test.py --verbose -i distributed/test_c10d_gloo 2022-11-23T01:50:30.0840059Z Ignoring disabled issues: [] 2022-11-23T01:50:30.1360174Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T01:50:30.1361257Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T01:50:30.1361618Z Selected tests: 2022-11-23T01:50:30.1361891Z distributed/test_c10d_gloo 2022-11-23T01:50:30.1385835Z Prioritized test from test file changes. 
2022-11-23T01:50:30.1386425Z reordering tests for PR: 2022-11-23T01:50:30.1386810Z prioritized: [] 2022-11-23T01:50:30.1387182Z the rest: ['distributed/test_c10d_gloo'] 2022-11-23T01:50:30.1387380Z 2022-11-23T01:50:30.1388026Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T01:50:30.1389197Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T01:50:30.1392948Z parallel (file granularity) tests: 2022-11-23T01:50:30.1393325Z 2022-11-23T01:50:30.1393590Z serial (file granularity) tests: 2022-11-23T01:50:30.1393979Z distributed/test_c10d_gloo 2022-11-23T01:50:32.3803447Z Ignoring disabled issues: [] 2022-11-23T01:50:32.7985601Z Running distributed/test_c10d_gloo ... [2022-11-23 01:50:32.797904] 2022-11-23T01:50:32.7986387Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_gloo.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 01:50:32.798370] 2022-11-23T02:05:07.8938324Z 2022-11-23T02:05:07.8938857Z Expand the folded group to see the log file of distributed/test_c10d_gloo 2022-11-23T02:05:07.8940969Z ##[group]PRINTING LOG FILE of distributed/test_c10d_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_gloo_or6yn0rj) 2022-11-23T02:05:07.8943794Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph5ivvos9 2022-11-23T02:05:07.8944682Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph5ivvos9/_remote_module_non_scriptable.py 2022-11-23T02:05:07.8948519Z , <__main__.CommTest testMethod=test_broadcast_coalesced_gloo_cuda>, <__main__.CommTest testMethod=test_gloo_barrier_device_ids>, <__main__.CommTest testMethod=test_gloo_rank_membership>, <__main__.CommTest testMethod=test_gloo_warn_not_in_group>, <__main__.CommTest testMethod=test_sequence_num_incremented_gloo_default>, <__main__.CommTest testMethod=test_sequence_num_incremented_gloo_subgroup>, <__main__.CommTest testMethod=test_sequence_num_set_default_pg_gloo>, <__main__.CommTest testMethod=test_sequence_num_set_gloo_new_group>, <__main__.CommTest testMethod=test_tensor_dtype_complex>, <__main__.CommTest testMethod=test_tensor_dtype_mismatch>]> 2022-11-23T02:05:07.8950593Z test_broadcast_coalesced_gloo_cpu (__main__.CommTest) 2022-11-23T02:05:07.8951250Z test_broadcast_coalesced_gloo_cuda (__main__.CommTest) 2022-11-23T02:05:07.8951864Z test_gloo_barrier_device_ids (__main__.CommTest) 2022-11-23T02:05:07.8952442Z test_gloo_rank_membership (__main__.CommTest) 2022-11-23T02:05:07.8952999Z test_gloo_warn_not_in_group (__main__.CommTest) 2022-11-23T02:05:07.8953620Z test_sequence_num_incremented_gloo_default (__main__.CommTest) 2022-11-23T02:05:07.8954274Z test_sequence_num_incremented_gloo_subgroup (__main__.CommTest) 2022-11-23T02:05:07.8954922Z test_sequence_num_set_default_pg_gloo (__main__.CommTest) 2022-11-23T02:05:07.8955563Z 
test_sequence_num_set_gloo_new_group (__main__.CommTest) 2022-11-23T02:05:07.8956160Z test_tensor_dtype_complex (__main__.CommTest) 2022-11-23T02:05:07.8956721Z test_tensor_dtype_mismatch (__main__.CommTest) 2022-11-23T02:05:07.8958971Z , <__main__.CompilerTest testMethod=test_allgather_work_wait_gpu>, <__main__.CompilerTest testMethod=test_allreduce_work_wait_cpu>, <__main__.CompilerTest testMethod=test_allreduce_work_wait_gpu>, <__main__.CompilerTest testMethod=test_broadcast_work_wait_cpu>, <__main__.CompilerTest testMethod=test_broadcast_work_wait_gpu>, <__main__.CompilerTest testMethod=test_consecutive_comm_work_wait_cpu>, <__main__.CompilerTest testMethod=test_consecutive_comm_work_wait_gpu>, <__main__.CompilerTest testMethod=test_nested_comm_tensor_wrapping>, <__main__.CompilerTest testMethod=test_scatter_work_wait_cpu>, <__main__.CompilerTest testMethod=test_scatter_work_wait_gpu>]> 2022-11-23T02:05:07.8961175Z test_allgather_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:05:07.8961807Z test_allgather_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:05:07.8962422Z test_allreduce_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:05:07.8965385Z test_allreduce_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:05:07.8965956Z test_broadcast_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:05:07.8966417Z test_broadcast_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:05:07.8967052Z test_consecutive_comm_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:05:07.8967419Z test_consecutive_comm_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:05:07.8967788Z test_nested_comm_tensor_wrapping (__main__.CompilerTest) 2022-11-23T02:05:07.8968368Z test_scatter_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:05:07.8968858Z test_scatter_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:05:07.8975436Z , <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_dynamic_weight_sharing>, <__main__.DistributedDataParallelTest 
testMethod=test_ddp_checkpointing_once_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_cpu>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_gpu_gloo>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_register_just_once>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_sparse_gradients>, <__main__.DistributedDataParallelTest testMethod=test_ddp_invalid_comm_hook_init>, <__main__.DistributedDataParallelTest testMethod=test_ddp_invalid_comm_hook_return_type>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_when_unused_parameters_empty>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad_with_grad_is_view>, <__main__.DistributedDataParallelTest 
testMethod=test_global_local_unused_params_grad_with_static_graph>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_1gpu_module_device_ids_integer_list>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_1gpu_module_device_ids_torch_device_list>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_2gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_4gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_cpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_cpu_module_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_ignored_output>, <__main__.DistributedDataParallelTest testMethod=test_ignored_output_with_unused_parameters>, <__main__.DistributedDataParallelTest testMethod=test_ignored_sharded_tensor>, <__main__.DistributedDataParallelTest testMethod=test_invalid_powerSGD_state>, <__main__.DistributedDataParallelTest testMethod=test_save_load_checkpoint>, <__main__.DistributedDataParallelTest testMethod=test_sparse_gradients>, <__main__.DistributedDataParallelTest testMethod=test_sparse_gradients_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_empty_input>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_only_empty_input>]> 2022-11-23T02:05:07.8982190Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8982712Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8983514Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8984619Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8985306Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8985856Z 
test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8986574Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8987186Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8987858Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8988427Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8988994Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8989646Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8990175Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8990671Z test_ddp_comm_hook_future_passing_cpu (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8991150Z test_ddp_comm_hook_future_passing_gpu_gloo (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8991587Z test_ddp_comm_hook_register_just_once (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8992058Z test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8992509Z test_ddp_invalid_comm_hook_init (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8992952Z test_ddp_invalid_comm_hook_return_type (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8993440Z test_find_unused_parameters_when_unused_parameters_empty (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8993916Z test_global_local_unused_params_grad (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8994388Z test_global_local_unused_params_grad_with_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8994862Z 
test_global_local_unused_params_grad_with_static_graph (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8995356Z test_gloo_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8995852Z test_gloo_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8996303Z test_gloo_backend_2gpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8996732Z test_gloo_backend_4gpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8997155Z test_gloo_backend_cpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8997600Z test_gloo_backend_cpu_module_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8998010Z test_ignored_output (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8998448Z test_ignored_output_with_unused_parameters (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8998889Z test_ignored_sharded_tensor (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8999296Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.8999711Z test_save_load_checkpoint (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9000126Z test_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9000566Z test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9001219Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9001683Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9002991Z , <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_allreduce_coalesced>, <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_collectives>, <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_monitored_barrier>]> 2022-11-23T02:05:07.9004092Z test_allgather_coalesced 
(__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:05:07.9004625Z test_allreduce_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:05:07.9005160Z test_collectives (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:05:07.9005763Z test_monitored_barrier (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:05:07.9006189Z 2022-11-23T02:05:07.9012215Z , <__main__.ProcessGroupGlooTest testMethod=test_allgather_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_coalesced_async>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_coalesced_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_noncontiguous_input>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_cuda_using_work_api>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_using_work_api>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_async>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_basics>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_checks_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_barrier_implies_wait>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_basics>, 
<__main__.ProcessGroupGlooTest testMethod=test_broadcast_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_checks>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_stress>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_empty_tensors>, <__main__.ProcessGroupGlooTest testMethod=test_gather_basics>, <__main__.ProcessGroupGlooTest testMethod=test_gather_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_gather_checks>, <__main__.ProcessGroupGlooTest testMethod=test_gather_noncontiguous_input>, <__main__.ProcessGroupGlooTest testMethod=test_gather_stress>, <__main__.ProcessGroupGlooTest testMethod=test_gather_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_multi_device_constructor>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_checks>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_stress>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_round_robin>, <__main__.ProcessGroupGlooTest testMethod=test_round_robin_create_destroy>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_basics>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_checks>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_stress>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_send_recv_all_to_all>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_checks>]> 2022-11-23T02:05:07.9017202Z test_allgather_basics (__main__.ProcessGroupGlooTest) 
2022-11-23T02:05:07.9017595Z test_allgather_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9017954Z test_allgather_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9018339Z test_allgather_coalesced_async (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9018796Z test_allgather_coalesced_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9019217Z test_allgather_noncontiguous_input (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9019592Z test_allgather_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9019970Z test_allgather_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9020344Z test_allreduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9020698Z test_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9021104Z test_allreduce_basics_cuda_using_work_api (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9021528Z test_allreduce_basics_using_work_api (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9021911Z test_allreduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9022291Z test_allreduce_coalesced_async (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9022690Z test_allreduce_coalesced_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9023091Z test_allreduce_coalesced_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9023482Z test_allreduce_coalesced_checks_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9024316Z test_allreduce_coalesced_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9024714Z test_allreduce_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9025073Z test_allreduce_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9025461Z test_barrier_implies_wait (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9025836Z test_broadcast_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9026193Z test_broadcast_basics_cuda (__main__.ProcessGroupGlooTest) 
2022-11-23T02:05:07.9026567Z test_broadcast_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9026935Z test_broadcast_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9027307Z test_broadcast_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9027656Z test_empty_tensors (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9028019Z test_gather_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9028391Z test_gather_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9028731Z test_gather_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9029116Z test_gather_noncontiguous_input (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9029495Z test_gather_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9029839Z test_gather_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9030228Z test_multi_device_constructor (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9030596Z test_reduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9030964Z test_reduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9031307Z test_reduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9031662Z test_reduce_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9032019Z test_reduce_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9032367Z test_round_robin (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9032842Z test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9033231Z test_scatter_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9033585Z test_scatter_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9033951Z test_scatter_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9034311Z test_scatter_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9034673Z test_scatter_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9035031Z test_send_recv_all_to_all 
(__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9035414Z test_sparse_allreduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9035813Z test_sparse_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9036192Z test_sparse_allreduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:05:07.9037055Z , <__main__.ReducerTest testMethod=test_forward_backward_optimizer>, <__main__.ReducerTest testMethod=test_forward_backward_unused_parameters>, <__main__.ReducerTest testMethod=test_multi_dtype_multi_bucket>, <__main__.ReducerTest testMethod=test_multi_dtype_single_bucket>, <__main__.ReducerTest testMethod=test_single_dtype_single_bucket>]> 2022-11-23T02:05:07.9037924Z test_forward_backward (__main__.ReducerTest) 2022-11-23T02:05:07.9038275Z test_forward_backward_optimizer (__main__.ReducerTest) 2022-11-23T02:05:07.9038625Z test_forward_backward_unused_parameters (__main__.ReducerTest) 2022-11-23T02:05:07.9038989Z test_multi_dtype_multi_bucket (__main__.ReducerTest) 2022-11-23T02:05:07.9039341Z test_multi_dtype_single_bucket (__main__.ReducerTest) 2022-11-23T02:05:07.9039674Z test_single_dtype_single_bucket (__main__.ReducerTest) 2022-11-23T02:05:07.9040101Z ]> 2022-11-23T02:05:07.9040516Z test_logging_init (__main__.RendezvousEnvTest) 2022-11-23T02:05:07.9040849Z 2022-11-23T02:05:07.9041257Z ]> 2022-11-23T02:05:07.9041685Z test_default_store_timeout_gloo (__main__.TimeoutTest) 2022-11-23T02:05:07.9042387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9042833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9043412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9043883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9044354Z INFO:torch.distributed.nn.jit.instantiator:Created a 
temporary directory at /tmp/tmp8ug12z0h 2022-11-23T02:05:07.9044873Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8ug12z0h/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9045188Z 2022-11-23T02:05:07.9045301Z Running tests... 2022-11-23T02:05:07.9045715Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9046235Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9046727Z test_broadcast_coalesced_gloo_cpu (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9047199Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2513 2022-11-23T02:05:07.9047653Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2514 2022-11-23T02:05:07.9048237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9048688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9049263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9049810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9050383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9050830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9051402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9051855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9052324Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpisafdqhw 2022-11-23T02:05:07.9052872Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpisafdqhw/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9053386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9053862Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpygd8vswz 2022-11-23T02:05:07.9054464Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpygd8vswz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9054970Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9055292Z ok (4.148s) 2022-11-23T02:05:07.9055443Z 2022-11-23T02:05:07.9055727Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9056060Z Ran 1 test in 4.148s 2022-11-23T02:05:07.9056224Z 2022-11-23T02:05:07.9056321Z OK 2022-11-23T02:05:07.9056439Z 2022-11-23T02:05:07.9056564Z Generating XML reports... 2022-11-23T02:05:07.9057106Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015036.xml 2022-11-23T02:05:07.9057970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9058454Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9059233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9059713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9060184Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxvrc0n9s 2022-11-23T02:05:07.9060707Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxvrc0n9s/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9061010Z 2022-11-23T02:05:07.9061123Z Running tests... 
2022-11-23T02:05:07.9061533Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9062068Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9062535Z test_broadcast_coalesced_gloo_cuda (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9063006Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2718 2022-11-23T02:05:07.9063458Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2719 2022-11-23T02:05:07.9064125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9064803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9065463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9065938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9066509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9066954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9067529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9068104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9068574Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvo75j6tk 2022-11-23T02:05:07.9069123Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvo75j6tk/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9069650Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2t0jwtu1 2022-11-23T02:05:07.9070163Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp2t0jwtu1/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9070678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9071152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9071503Z ok (5.828s) 2022-11-23T02:05:07.9071637Z 2022-11-23T02:05:07.9071918Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9072325Z Ran 1 test in 5.828s 2022-11-23T02:05:07.9072490Z 2022-11-23T02:05:07.9072588Z OK 2022-11-23T02:05:07.9072727Z 2022-11-23T02:05:07.9072835Z Generating XML reports... 2022-11-23T02:05:07.9073395Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015043.xml 2022-11-23T02:05:07.9074058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9074512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9075064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9075537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9076005Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpns_a55g6 2022-11-23T02:05:07.9076527Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpns_a55g6/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9076833Z 2022-11-23T02:05:07.9076958Z Running tests... 2022-11-23T02:05:07.9077374Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9077911Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9078367Z test_gloo_barrier_device_ids (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9078831Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2925 2022-11-23T02:05:07.9079282Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2926 2022-11-23T02:05:07.9079868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9080318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9080897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9081379Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9081938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9082391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9082962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9083436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9083884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3jqn476o 2022-11-23T02:05:07.9084431Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3jqn476o/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9084969Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgyoajt7l 2022-11-23T02:05:07.9085538Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgyoajt7l/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9086066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9086545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9087032Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9087508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9088173Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9088872Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9089261Z ok (4.157s) 2022-11-23T02:05:07.9089393Z 2022-11-23T02:05:07.9089719Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9090061Z Ran 1 test in 4.157s 2022-11-23T02:05:07.9090225Z 2022-11-23T02:05:07.9090320Z OK 2022-11-23T02:05:07.9090462Z 2022-11-23T02:05:07.9090568Z Generating XML reports... 2022-11-23T02:05:07.9091109Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015051.xml 2022-11-23T02:05:07.9091773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9092223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9092783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9093247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9093719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl0mjg4f_ 2022-11-23T02:05:07.9094244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl0mjg4f_/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9094551Z 2022-11-23T02:05:07.9094661Z Running tests... 
2022-11-23T02:05:07.9095069Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9095600Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9096065Z test_gloo_rank_membership (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9096530Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3130 2022-11-23T02:05:07.9096972Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3131 2022-11-23T02:05:07.9097556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9098008Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9098594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9099070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9099634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9100093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9100678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9101144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9101597Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd74ayouz 2022-11-23T02:05:07.9102141Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd74ayouz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9102673Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7rpd8jj 2022-11-23T02:05:07.9103244Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpv7rpd8jj/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9103768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9104733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9105226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9105706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9106373Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9107062Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9107680Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:05:07.9108153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:05:07.9108797Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:05:07.9109481Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:05:07.9109869Z ok (4.147s) 2022-11-23T02:05:07.9110000Z 2022-11-23T02:05:07.9110266Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9110596Z Ran 1 test in 4.147s 2022-11-23T02:05:07.9110758Z 2022-11-23T02:05:07.9110849Z OK 2022-11-23T02:05:07.9110963Z 2022-11-23T02:05:07.9111087Z Generating XML reports... 
2022-11-23T02:05:07.9111624Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015058.xml 2022-11-23T02:05:07.9112289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9112720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9113291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9113755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9114219Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvnab59je 2022-11-23T02:05:07.9114736Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvnab59je/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9115034Z 2022-11-23T02:05:07.9115143Z Running tests... 2022-11-23T02:05:07.9115544Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9116068Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9116523Z test_gloo_warn_not_in_group (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9116975Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3338 2022-11-23T02:05:07.9117412Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3339 2022-11-23T02:05:07.9118028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9118473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9119038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9119507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9120060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9120509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9121154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9121612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9122084Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplkfcpt6o 2022-11-23T02:05:07.9122621Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplkfcpt6o/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9123127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9123604Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmzm0k8wq 2022-11-23T02:05:07.9124134Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmzm0k8wq/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9124634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9125175Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9125646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9126302Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9126986Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9127497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:05:07.9127981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:05:07.9128620Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:05:07.9129304Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:05:07.9129677Z ok (5.748s) 2022-11-23T02:05:07.9129826Z 2022-11-23T02:05:07.9130092Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9130418Z Ran 1 test in 5.748s 2022-11-23T02:05:07.9130579Z 2022-11-23T02:05:07.9130652Z OK 2022-11-23T02:05:07.9130787Z 2022-11-23T02:05:07.9130915Z Generating XML reports... 
2022-11-23T02:05:07.9131466Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015105.xml 2022-11-23T02:05:07.9132126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9132556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9133123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9133596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9134057Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9khnbcvx 2022-11-23T02:05:07.9134571Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9khnbcvx/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9134870Z 2022-11-23T02:05:07.9134977Z Running tests... 2022-11-23T02:05:07.9135381Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9135893Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9136402Z test_sequence_num_incremented_gloo_default (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9136885Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3548 2022-11-23T02:05:07.9137339Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3549 2022-11-23T02:05:07.9137986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9138454Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9139040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9139493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9140081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9140534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9141111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9141562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9142032Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx6piuuz7 2022-11-23T02:05:07.9142645Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx6piuuz7/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9143139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9143632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptrvg1seq 2022-11-23T02:05:07.9144442Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptrvg1seq/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9144952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9145414Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9145910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9146573Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9147269Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9147781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:05:07.9148273Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:05:07.9148916Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:05:07.9149591Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:05:07.9149961Z ok (5.828s) 2022-11-23T02:05:07.9150110Z 2022-11-23T02:05:07.9150377Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9150703Z Ran 1 test in 5.828s 2022-11-23T02:05:07.9150867Z 2022-11-23T02:05:07.9150941Z OK 2022-11-23T02:05:07.9151075Z 2022-11-23T02:05:07.9151201Z Generating XML reports... 
2022-11-23T02:05:07.9151737Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015113.xml 2022-11-23T02:05:07.9152395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9152823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9153388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9153853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9154364Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_l1g710n 2022-11-23T02:05:07.9154899Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_l1g710n/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9155197Z 2022-11-23T02:05:07.9155307Z Running tests... 2022-11-23T02:05:07.9155792Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9156314Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9156810Z test_sequence_num_incremented_gloo_subgroup (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9157289Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3761 2022-11-23T02:05:07.9157710Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3762 2022-11-23T02:05:07.9158311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9158759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9159328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9159845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9160423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9160868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9161418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9161882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9162348Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqzw5gw5l 2022-11-23T02:05:07.9162889Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqzw5gw5l/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9163401Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz5dq5t90 2022-11-23T02:05:07.9163932Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz5dq5t90/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9164442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9164907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9165327Z ok 
(4.143s) 2022-11-23T02:05:07.9165478Z 2022-11-23T02:05:07.9165755Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9166085Z Ran 1 test in 4.143s 2022-11-23T02:05:07.9166246Z 2022-11-23T02:05:07.9166319Z OK 2022-11-23T02:05:07.9166452Z 2022-11-23T02:05:07.9166575Z Generating XML reports... 2022-11-23T02:05:07.9167110Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015121.xml 2022-11-23T02:05:07.9167769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9168198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9168774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9169240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9169681Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjdkbask9 2022-11-23T02:05:07.9170222Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjdkbask9/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9170522Z 2022-11-23T02:05:07.9170631Z Running tests... 2022-11-23T02:05:07.9171033Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9171541Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9172027Z test_sequence_num_set_default_pg_gloo (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9172495Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3960 2022-11-23T02:05:07.9172988Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3961 2022-11-23T02:05:07.9173594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9174048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9174619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9175070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9175642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9176168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9176715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9177234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9177702Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0922erh0 2022-11-23T02:05:07.9178236Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0922erh0/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9178736Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1y0600qr 2022-11-23T02:05:07.9179264Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1y0600qr/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9179769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9180235Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9180698Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9181184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9181841Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9182505Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9182899Z ok (4.160s) 2022-11-23T02:05:07.9183046Z 2022-11-23T02:05:07.9183310Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9183637Z Ran 1 test in 4.161s 2022-11-23T02:05:07.9183778Z 2022-11-23T02:05:07.9184116Z OK 2022-11-23T02:05:07.9184261Z 2022-11-23T02:05:07.9184387Z Generating XML reports... 2022-11-23T02:05:07.9184937Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015128.xml 2022-11-23T02:05:07.9185575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9186030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9186601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9187066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9187509Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp93m4hpxo 2022-11-23T02:05:07.9188044Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp93m4hpxo/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9188341Z 2022-11-23T02:05:07.9188452Z Running tests... 
2022-11-23T02:05:07.9188836Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9189364Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9189841Z test_sequence_num_set_gloo_new_group (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9190310Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4165 2022-11-23T02:05:07.9190866Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4166 2022-11-23T02:05:07.9191480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9191933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9192505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9192952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9193523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9193968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9194514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9195073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9195531Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8d6he89k 2022-11-23T02:05:07.9196069Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8d6he89k/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9196575Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg9et6w5e 2022-11-23T02:05:07.9197108Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpg9et6w5e/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9197608Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9198063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9198540Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9199029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9199687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9200355Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9200881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:05:07.9201365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:05:07.9202002Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:05:07.9202655Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:05:07.9203049Z ok (4.120s) 2022-11-23T02:05:07.9203195Z 2022-11-23T02:05:07.9203465Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9203776Z Ran 1 test in 4.120s 2022-11-23T02:05:07.9203937Z 2022-11-23T02:05:07.9204030Z OK 2022-11-23T02:05:07.9204163Z 2022-11-23T02:05:07.9204286Z Generating XML reports... 
2022-11-23T02:05:07.9204821Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015134.xml 2022-11-23T02:05:07.9205452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9205901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9206468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9206915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9207389Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb6n7ivqy 2022-11-23T02:05:07.9207971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb6n7ivqy/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9208274Z 2022-11-23T02:05:07.9208385Z Running tests... 2022-11-23T02:05:07.9208772Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9209299Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9209766Z test_tensor_dtype_complex (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9210204Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4376 2022-11-23T02:05:07.9210644Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4377 2022-11-23T02:05:07.9211243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9211843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9212400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9212867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9213443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9213882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9214430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9214890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9215354Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj5s9yw2n 2022-11-23T02:05:07.9215872Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj5s9yw2n/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9216408Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoqchqcvz 2022-11-23T02:05:07.9216941Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoqchqcvz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9217445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9217896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9218377Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9219029Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9219564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9220190Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9220581Z ok (4.167s) 2022-11-23T02:05:07.9220733Z 2022-11-23T02:05:07.9221008Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9221319Z Ran 1 test in 4.167s 2022-11-23T02:05:07.9221479Z 2022-11-23T02:05:07.9221570Z OK 2022-11-23T02:05:07.9221702Z 2022-11-23T02:05:07.9221825Z Generating XML reports... 2022-11-23T02:05:07.9222345Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015141.xml 2022-11-23T02:05:07.9223002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9223450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9224212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9224668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9225207Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbhcq2vow 2022-11-23T02:05:07.9225751Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbhcq2vow/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9226049Z 2022-11-23T02:05:07.9226158Z Running tests... 
2022-11-23T02:05:07.9226553Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9227081Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9227550Z test_tensor_dtype_mismatch (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9227985Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4581 2022-11-23T02:05:07.9228428Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4582 2022-11-23T02:05:07.9229030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9229629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9230184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9230655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9231233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9231677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9232224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9232684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9233145Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoe3igc2b 2022-11-23T02:05:07.9233663Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoe3igc2b/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9234176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9234672Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at 
/tmp/tmpcvauy2jo 2022-11-23T02:05:07.9235201Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcvauy2jo/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9235686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9236168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9236657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9237290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9237984Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9239029Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9239666Z warnings.warn( 2022-11-23T02:05:07.9240528Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9241115Z warnings.warn( 2022-11-23T02:05:07.9242027Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9242658Z warnings.warn( 2022-11-23T02:05:07.9243517Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9244110Z warnings.warn( 2022-11-23T02:05:07.9244349Z ok (4.246s) 2022-11-23T02:05:07.9244497Z 2022-11-23T02:05:07.9244767Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9245077Z Ran 1 test in 4.247s 2022-11-23T02:05:07.9245243Z 2022-11-23T02:05:07.9245337Z OK 2022-11-23T02:05:07.9245470Z 2022-11-23T02:05:07.9245594Z Generating XML reports... 2022-11-23T02:05:07.9246131Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015148.xml 2022-11-23T02:05:07.9246829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9247396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9247975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9248425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9248891Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwl0x6kiw 2022-11-23T02:05:07.9249429Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwl0x6kiw/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9249730Z 2022-11-23T02:05:07.9249840Z Running tests... 
2022-11-23T02:05:07.9250231Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9250761Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9251251Z test_allgather_work_wait_cpu (__main__.CompilerTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9251719Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4786 2022-11-23T02:05:07.9252145Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4787 2022-11-23T02:05:07.9252745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9253192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9253744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9254213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9254788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9255239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9255793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9256255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9256716Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkjb54hmo 2022-11-23T02:05:07.9257235Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkjb54hmo/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9257741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9258235Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory 
at /tmp/tmpc5h7t_d4 2022-11-23T02:05:07.9258759Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc5h7t_d4/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9259243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9259786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9260291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9260950Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9261616Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9262539Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9263251Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9264394Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9265107Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9265537Z ok (4.256s) 2022-11-23T02:05:07.9265688Z 2022-11-23T02:05:07.9265963Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9266300Z Ran 1 test in 4.257s 2022-11-23T02:05:07.9266441Z 2022-11-23T02:05:07.9266534Z OK 2022-11-23T02:05:07.9266667Z 2022-11-23T02:05:07.9266790Z Generating XML reports... 
2022-11-23T02:05:07.9267347Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015154.xml 2022-11-23T02:05:07.9267998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9268456Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9269029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9269505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9269950Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps099xe8h 2022-11-23T02:05:07.9270485Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps099xe8h/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9270782Z 2022-11-23T02:05:07.9270892Z Running tests... 2022-11-23T02:05:07.9271373Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9271902Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9272384Z test_allgather_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9272858Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4991 2022-11-23T02:05:07.9273281Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4992 2022-11-23T02:05:07.9273937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9274380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9274962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9275406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9275979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9276419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9276981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9277508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9277979Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfm0tnquh 2022-11-23T02:05:07.9278512Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfm0tnquh/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9279025Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu79ad1mz 2022-11-23T02:05:07.9279553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu79ad1mz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9280053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9280516Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9280980Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9281544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9282198Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9282881Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9283787Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9284504Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9285342Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9286049Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9286353Z ok (5.791s) 2022-11-23T02:05:07.9286500Z 2022-11-23T02:05:07.9286765Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9287092Z Ran 1 test in 5.791s 2022-11-23T02:05:07.9287249Z 2022-11-23T02:05:07.9287323Z OK 2022-11-23T02:05:07.9287456Z 2022-11-23T02:05:07.9287578Z Generating XML reports... 
2022-11-23T02:05:07.9288125Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015201.xml 2022-11-23T02:05:07.9288784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9289210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9289776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9290246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9290690Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpigwa87sg 2022-11-23T02:05:07.9291226Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpigwa87sg/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9291521Z 2022-11-23T02:05:07.9291627Z Running tests... 2022-11-23T02:05:07.9292028Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9292535Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9293015Z test_allreduce_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9293475Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5198 2022-11-23T02:05:07.9293916Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5199 2022-11-23T02:05:07.9294577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9295026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9295591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9296040Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9296608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9297047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9297611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9298055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9298576Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxalna1uo 2022-11-23T02:05:07.9299112Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxalna1uo/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9299599Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9300090Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp__kiaz2m 2022-11-23T02:05:07.9300616Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp__kiaz2m/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9301111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9301576Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9302060Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9302719Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9303397Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9304582Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9305309Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9306146Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9306845Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9307672Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9308380Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9309216Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:05:07.9309916Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9310221Z ok (4.116s) 2022-11-23T02:05:07.9310370Z 2022-11-23T02:05:07.9310636Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9310966Z Ran 1 test in 4.116s 2022-11-23T02:05:07.9311124Z 2022-11-23T02:05:07.9311329Z OK 2022-11-23T02:05:07.9311453Z 2022-11-23T02:05:07.9311577Z Generating XML reports... 2022-11-23T02:05:07.9312136Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015209.xml 2022-11-23T02:05:07.9312798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9313221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9313787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9314247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9314703Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp26i7fgoc 2022-11-23T02:05:07.9315221Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp26i7fgoc/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9315586Z 2022-11-23T02:05:07.9315699Z Running tests... 2022-11-23T02:05:07.9316102Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9316630Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9317091Z test_allreduce_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9317549Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5403 2022-11-23T02:05:07.9317989Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5404 2022-11-23T02:05:07.9318567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9319015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9319568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9320016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9320568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9321031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9321611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9322055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9322517Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpazqon2l5 2022-11-23T02:05:07.9323049Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpazqon2l5/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9323572Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4urp8vyl 2022-11-23T02:05:07.9324085Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4urp8vyl/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9324586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9325050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9325529Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9325997Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9326647Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9327324Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9328269Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9328994Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9329832Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9330533Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9331364Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9332038Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9332926Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:05:07.9333621Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9333940Z ok (5.732s) 2022-11-23T02:05:07.9334085Z 2022-11-23T02:05:07.9334334Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9334661Z Ran 1 test in 5.733s 2022-11-23T02:05:07.9334819Z 2022-11-23T02:05:07.9334911Z OK 2022-11-23T02:05:07.9335042Z 2022-11-23T02:05:07.9335148Z Generating XML reports... 2022-11-23T02:05:07.9335700Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015216.xml 2022-11-23T02:05:07.9336361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9336810Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9337363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9337824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9338288Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo90h2wxv 2022-11-23T02:05:07.9338821Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo90h2wxv/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9339101Z 2022-11-23T02:05:07.9339208Z Running tests... 2022-11-23T02:05:07.9339606Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9340128Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9340592Z test_broadcast_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9341053Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5610 2022-11-23T02:05:07.9341492Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5611 2022-11-23T02:05:07.9342091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9342519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9343084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9343547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9344369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9344812Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9345449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9345913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9346348Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphc54ezbh 2022-11-23T02:05:07.9346881Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphc54ezbh/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9347400Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmiqi466j 2022-11-23T02:05:07.9347909Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmiqi466j/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9348407Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9348870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9349413Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9350056Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9350586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9351227Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9352141Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9352837Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9353676Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9354378Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9354700Z ok (4.137s) 2022-11-23T02:05:07.9354846Z 2022-11-23T02:05:07.9355094Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9355417Z Ran 1 test in 4.138s 2022-11-23T02:05:07.9355574Z 2022-11-23T02:05:07.9355664Z OK 2022-11-23T02:05:07.9355795Z 2022-11-23T02:05:07.9355917Z Generating XML reports... 
2022-11-23T02:05:07.9356450Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015224.xml 2022-11-23T02:05:07.9357114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9357557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9358116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9358581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9359042Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5ovyh34w 2022-11-23T02:05:07.9359574Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5ovyh34w/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9359854Z 2022-11-23T02:05:07.9359960Z Running tests... 2022-11-23T02:05:07.9360357Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9360883Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9361339Z test_broadcast_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9361799Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5815 2022-11-23T02:05:07.9362296Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5816 2022-11-23T02:05:07.9362905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9363330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9363967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9364435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9364988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9365515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9366083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9366595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9367040Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnncl_uym 2022-11-23T02:05:07.9367573Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnncl_uym/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9368099Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcrr1b2mp 2022-11-23T02:05:07.9368625Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcrr1b2mp/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9369111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9369579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9370058Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9370525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9371184Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9371868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9372775Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9373468Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9374307Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9375011Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9375334Z ok (5.726s) 2022-11-23T02:05:07.9375480Z 2022-11-23T02:05:07.9375726Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9376052Z Ran 1 test in 5.727s 2022-11-23T02:05:07.9376213Z 2022-11-23T02:05:07.9376303Z OK 2022-11-23T02:05:07.9376435Z 2022-11-23T02:05:07.9376564Z Generating XML reports... 
2022-11-23T02:05:07.9377098Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015231.xml 2022-11-23T02:05:07.9377760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9378202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9378749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9379269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9379734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps2mss6q4 2022-11-23T02:05:07.9380266Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps2mss6q4/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9380564Z 2022-11-23T02:05:07.9380655Z Running tests... 2022-11-23T02:05:07.9381058Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9381580Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9382049Z test_consecutive_comm_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9382518Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6022 2022-11-23T02:05:07.9382958Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6023 2022-11-23T02:05:07.9383615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9384306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9384879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9385345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9385916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9386339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9386900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9387357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9387797Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx9lq4rwm 2022-11-23T02:05:07.9388341Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx9lq4rwm/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9388845Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9389337Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwgql03_k 2022-11-23T02:05:07.9389851Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwgql03_k/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9390350Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9390825Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9391295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9391938Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9392624Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9393532Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9394239Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9395055Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9395758Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9396669Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9397385Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9398207Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:05:07.9398904Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9399731Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9400501Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9401332Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9402007Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9402841Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9403534Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9404369Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9405050Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9405368Z ok (4.123s) 2022-11-23T02:05:07.9405515Z 2022-11-23T02:05:07.9405779Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9406105Z Ran 1 test in 4.123s 2022-11-23T02:05:07.9406247Z 2022-11-23T02:05:07.9406337Z OK 2022-11-23T02:05:07.9406467Z 
2022-11-23T02:05:07.9406590Z Generating XML reports... 2022-11-23T02:05:07.9407138Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015239.xml 2022-11-23T02:05:07.9407789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9408235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9408810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9409269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9409710Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp25wm8r4k 2022-11-23T02:05:07.9410244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp25wm8r4k/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9410537Z 2022-11-23T02:05:07.9410644Z Running tests... 2022-11-23T02:05:07.9411027Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9411551Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9412038Z test_consecutive_comm_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9412505Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6227 2022-11-23T02:05:07.9413001Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6228 2022-11-23T02:05:07.9413608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9414054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9414599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9415064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9415637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9416078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9416624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9417146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9417607Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4xgmmgi7 2022-11-23T02:05:07.9418138Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4xgmmgi7/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9418644Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphkwnbfvj 2022-11-23T02:05:07.9419171Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphkwnbfvj/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9419676Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9420132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9420614Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9421112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9421775Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9422483Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9423405Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9424367Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9425216Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9425912Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9426750Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9427451Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9428286Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:05:07.9428977Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9429862Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9430580Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9431410Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9432103Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9432910Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9433691Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9434523Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9435217Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9435522Z ok (5.724s) 2022-11-23T02:05:07.9435668Z 2022-11-23T02:05:07.9435934Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9436262Z Ran 1 test in 5.724s 2022-11-23T02:05:07.9436419Z 2022-11-23T02:05:07.9436509Z OK 2022-11-23T02:05:07.9436622Z 
2022-11-23T02:05:07.9436744Z Generating XML reports... 2022-11-23T02:05:07.9437292Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015246.xml 2022-11-23T02:05:07.9437967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9438396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9438961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9439424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9439884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaygr9483 2022-11-23T02:05:07.9440397Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaygr9483/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9440694Z 2022-11-23T02:05:07.9440801Z Running tests... 2022-11-23T02:05:07.9441200Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9441709Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9442196Z test_nested_comm_tensor_wrapping (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9442662Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6434 2022-11-23T02:05:07.9443099Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6435 2022-11-23T02:05:07.9443681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9444124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9444690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9445154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9445702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9446138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9446761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9447209Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9447671Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd07zhth_ 2022-11-23T02:05:07.9448267Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd07zhth_/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9448768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9449245Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprgefvmye 2022-11-23T02:05:07.9449776Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprgefvmye/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9450277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9450796Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9451280Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9451933Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9452609Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9453507Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9454293Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9455154Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9455856Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9456686Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9492106Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9493029Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:05:07.9493817Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9494165Z ok (4.123s) 2022-11-23T02:05:07.9494323Z 2022-11-23T02:05:07.9494614Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9494943Z Ran 1 test in 4.123s 2022-11-23T02:05:07.9495115Z 2022-11-23T02:05:07.9495209Z OK 2022-11-23T02:05:07.9495345Z 2022-11-23T02:05:07.9495467Z Generating XML reports... 2022-11-23T02:05:07.9496031Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015254.xml 2022-11-23T02:05:07.9496747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9497226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9497835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9498317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9499024Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbh1k1d3x 2022-11-23T02:05:07.9499608Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbh1k1d3x/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9499925Z 2022-11-23T02:05:07.9500038Z Running tests... 2022-11-23T02:05:07.9500451Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9501012Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9501512Z test_scatter_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9501981Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6639 2022-11-23T02:05:07.9502461Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6640 2022-11-23T02:05:07.9503191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9503779Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9504684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9505163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9505725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9506152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9506706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9507165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9507636Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5rqkhdui 2022-11-23T02:05:07.9508168Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5rqkhdui/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9508675Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9509176Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpojb1zz4c 2022-11-23T02:05:07.9509688Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpojb1zz4c/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9510193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9510678Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9511175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9511813Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9512508Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9513426Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9514138Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9514965Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9515668Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9515998Z ok (4.150s) 2022-11-23T02:05:07.9516151Z 2022-11-23T02:05:07.9516421Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9516810Z Ran 1 test in 4.150s 2022-11-23T02:05:07.9516976Z 2022-11-23T02:05:07.9517069Z OK 2022-11-23T02:05:07.9517202Z 2022-11-23T02:05:07.9517327Z Generating XML reports... 
2022-11-23T02:05:07.9517864Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015300.xml 2022-11-23T02:05:07.9518534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9518982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9519605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9520057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9520521Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd4n1hjh1 2022-11-23T02:05:07.9521129Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd4n1hjh1/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9521428Z 2022-11-23T02:05:07.9521537Z Running tests... 2022-11-23T02:05:07.9521927Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9522453Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9522913Z test_scatter_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9523356Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6844 2022-11-23T02:05:07.9523808Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6845 2022-11-23T02:05:07.9524409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9524857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9525420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9525887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9526462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9526883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9527444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9527901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9528357Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvuan0ryk 2022-11-23T02:05:07.9528878Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvuan0ryk/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9529414Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpog78t9hk 2022-11-23T02:05:07.9529954Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpog78t9hk/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9530465Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9530919Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9531393Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9532049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9532560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9533207Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9534176Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9534898Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9535730Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:05:07.9536440Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:05:07.9536767Z ok (5.730s) 2022-11-23T02:05:07.9536916Z 2022-11-23T02:05:07.9537189Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9537507Z Ran 1 test in 5.731s 2022-11-23T02:05:07.9537728Z 2022-11-23T02:05:07.9537828Z OK 2022-11-23T02:05:07.9537962Z 2022-11-23T02:05:07.9538092Z Generating XML reports... 
2022-11-23T02:05:07.9538723Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015307.xml 2022-11-23T02:05:07.9539371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9539821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9540382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9540830Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9541297Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzdhuufo7 2022-11-23T02:05:07.9541569Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzdhuufo7/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9541593Z 2022-11-23T02:05:07.9541704Z Running tests... 2022-11-23T02:05:07.9541978Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9542291Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9542520Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9542974Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9543194Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7051 2022-11-23T02:05:07.9543409Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7052 2022-11-23T02:05:07.9543779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9544219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9544622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9544816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9545184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9545341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9545713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9545902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9546159Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo5mzqnbv 2022-11-23T02:05:07.9546434Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo5mzqnbv/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9546688Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqoio7l85 2022-11-23T02:05:07.9547051Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqoio7l85/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9547287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9547515Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9547598Z ok 
(6.243s) 2022-11-23T02:05:07.9547618Z 2022-11-23T02:05:07.9547886Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9547998Z Ran 1 test in 6.243s 2022-11-23T02:05:07.9548018Z 2022-11-23T02:05:07.9548111Z OK 2022-11-23T02:05:07.9548130Z 2022-11-23T02:05:07.9548254Z Generating XML reports... 2022-11-23T02:05:07.9548714Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015315.xml 2022-11-23T02:05:07.9549082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9549355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9549716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9549905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9550161Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqvxcbo2u 2022-11-23T02:05:07.9550430Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqvxcbo2u/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9550450Z 2022-11-23T02:05:07.9550557Z Running tests... 2022-11-23T02:05:07.9550822Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9551131Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9551373Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9551647Z Dynamic module can be checkpointed multiple times with weight sharing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9551844Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7266 2022-11-23T02:05:07.9552061Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7267 2022-11-23T02:05:07.9552436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9552611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9552986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9553177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9553538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9553721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9554074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9554264Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9554518Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1y__04lz 2022-11-23T02:05:07.9554783Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1y__04lz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9555034Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprre8nl78 2022-11-23T02:05:07.9555300Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprre8nl78/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9555527Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9555758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9555909Z ok 
(6.237s) 2022-11-23T02:05:07.9555930Z 2022-11-23T02:05:07.9556183Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9556295Z Ran 1 test in 6.237s 2022-11-23T02:05:07.9556315Z 2022-11-23T02:05:07.9556405Z OK 2022-11-23T02:05:07.9556424Z 2022-11-23T02:05:07.9556547Z Generating XML reports... 2022-11-23T02:05:07.9557005Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015324.xml 2022-11-23T02:05:07.9557372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9557547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9557918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9558138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9558394Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9sw2lj1y 2022-11-23T02:05:07.9558661Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9sw2lj1y/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9558681Z 2022-11-23T02:05:07.9558790Z Running tests... 2022-11-23T02:05:07.9559053Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9559366Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9559608Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9559856Z DDP works as expected when layer is checkpointed only once. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9560075Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7481 2022-11-23T02:05:07.9560271Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7482 2022-11-23T02:05:07.9560645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9560822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9561200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9561391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9561749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9561922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9562295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9562464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9562726Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpke5d9nnz 2022-11-23T02:05:07.9562998Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpke5d9nnz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9563225Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9563476Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_n5hlklm 2022-11-23T02:05:07.9563742Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_n5hlklm/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9563969Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9564204Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9564438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9564655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9564925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9565962Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9566080Z warnings.warn( 2022-11-23T02:05:07.9566989Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9567150Z warnings.warn( 2022-11-23T02:05:07.9567387Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9567615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9567845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9568078Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9568288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:05:07.9568513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9568734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9568959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9569065Z ok (6.436s) 2022-11-23T02:05:07.9569085Z 2022-11-23T02:05:07.9569351Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9569464Z Ran 1 test in 6.436s 2022-11-23T02:05:07.9569484Z 2022-11-23T02:05:07.9569575Z OK 2022-11-23T02:05:07.9569594Z 2022-11-23T02:05:07.9569700Z Generating XML reports... 2022-11-23T02:05:07.9570162Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015333.xml 2022-11-23T02:05:07.9570530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9570707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9571085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9571278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9571543Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk4t0d3vr 2022-11-23T02:05:07.9571815Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk4t0d3vr/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9571835Z 2022-11-23T02:05:07.9571943Z Running tests... 
2022-11-23T02:05:07.9572188Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9572500Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9572740Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9572989Z DDP works as expected when layer is checkpointed only once. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9573205Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7696 2022-11-23T02:05:07.9573422Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7697 2022-11-23T02:05:07.9573841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9574023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9574384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9574576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9574940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9575115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9575485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9575753Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9576010Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphqiqxgk0 2022-11-23T02:05:07.9576337Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphqiqxgk0/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9576592Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmpgy6zpupw 2022-11-23T02:05:07.9576801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9577071Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgy6zpupw/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9577297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9577532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9577764Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9577996Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9578233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9579142Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9579260Z warnings.warn( 2022-11-23T02:05:07.9580161Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9580258Z warnings.warn( 2022-11-23T02:05:07.9580495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:05:07.9580728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9580962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9581187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9581417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9581641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9581863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9582087Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9582170Z ok (6.406s) 2022-11-23T02:05:07.9582193Z 2022-11-23T02:05:07.9582507Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9582623Z Ran 1 test in 6.406s 2022-11-23T02:05:07.9582643Z 2022-11-23T02:05:07.9582734Z OK 2022-11-23T02:05:07.9582753Z 2022-11-23T02:05:07.9582877Z Generating XML reports... 
2022-11-23T02:05:07.9583337Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015342.xml 2022-11-23T02:05:07.9583705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9584157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9584536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9584802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9585062Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw3szi8pz 2022-11-23T02:05:07.9585435Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw3szi8pz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9585456Z 2022-11-23T02:05:07.9585566Z Running tests... 2022-11-23T02:05:07.9585833Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9586144Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9586408Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9586755Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9586953Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7911 2022-11-23T02:05:07.9587169Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7912 2022-11-23T02:05:07.9587537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9587720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9588100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9588290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9588652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9588826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9589182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9589370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9589625Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjawultgp 2022-11-23T02:05:07.9589904Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjawultgp/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9590154Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoaisxd9n 2022-11-23T02:05:07.9590420Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoaisxd9n/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9590651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9590878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9591115Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9591330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9591561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9591794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9591956Z ok (6.248s) 2022-11-23T02:05:07.9591978Z 2022-11-23T02:05:07.9592251Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9592363Z Ran 1 test in 6.248s 2022-11-23T02:05:07.9592383Z 2022-11-23T02:05:07.9592477Z OK 2022-11-23T02:05:07.9592496Z 2022-11-23T02:05:07.9592619Z Generating XML reports... 2022-11-23T02:05:07.9593059Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015350.xml 2022-11-23T02:05:07.9593427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9593605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9593982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9594220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9594479Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpodsmrvag 2022-11-23T02:05:07.9594751Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpodsmrvag/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9594771Z 2022-11-23T02:05:07.9594879Z Running tests... 
2022-11-23T02:05:07.9595127Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9595437Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9595703Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9596046Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9596263Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8126 2022-11-23T02:05:07.9596482Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8127 2022-11-23T02:05:07.9596852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9597029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9597404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9597577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9597939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9598112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9598481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9598672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9598932Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv_xxidsx 2022-11-23T02:05:07.9599199Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv_xxidsx/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9599452Z INFO:torch.distributed.nn.jit.instantiator:Created a 
temporary directory at /tmp/tmpk4kvgg08 2022-11-23T02:05:07.9599717Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk4kvgg08/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9599925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9600146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9600382Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9600617Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9600854Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9601131Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9601237Z ok (6.320s) 2022-11-23T02:05:07.9601257Z 2022-11-23T02:05:07.9601526Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9601619Z Ran 1 test in 6.320s 2022-11-23T02:05:07.9601638Z 2022-11-23T02:05:07.9601730Z OK 2022-11-23T02:05:07.9601749Z 2022-11-23T02:05:07.9601873Z Generating XML reports... 
2022-11-23T02:05:07.9602333Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015359.xml 2022-11-23T02:05:07.9602707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9602892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9603277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9603513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9603748Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnk7hqlam 2022-11-23T02:05:07.9604019Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnk7hqlam/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9604039Z 2022-11-23T02:05:07.9604148Z Running tests... 2022-11-23T02:05:07.9604415Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9604777Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9605024Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9605393Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9605613Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8341 2022-11-23T02:05:07.9605831Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8342 2022-11-23T02:05:07.9606180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9606356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9606732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9606925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9607287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9607461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9607832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9608027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9608264Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphp5qs6sk 2022-11-23T02:05:07.9608538Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphp5qs6sk/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9608789Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbt1ebk89 2022-11-23T02:05:07.9609054Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbt1ebk89/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9609280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9609506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9609741Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9610037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9610819Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:05:07.9611594Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:05:07.9611873Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9612101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9612184Z ok (6.355s) 2022-11-23T02:05:07.9612222Z 2022-11-23T02:05:07.9612473Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9612585Z Ran 1 test in 6.356s 2022-11-23T02:05:07.9612604Z 2022-11-23T02:05:07.9612695Z OK 2022-11-23T02:05:07.9612713Z 2022-11-23T02:05:07.9612840Z Generating XML reports... 
2022-11-23T02:05:07.9613302Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015408.xml 2022-11-23T02:05:07.9613673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9613857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9614236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9614405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9614658Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5cbg66ta 2022-11-23T02:05:07.9614928Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5cbg66ta/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9614948Z 2022-11-23T02:05:07.9615058Z Running tests... 2022-11-23T02:05:07.9615325Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9615635Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9615878Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9616258Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9616477Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8556 2022-11-23T02:05:07.9616671Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8557 2022-11-23T02:05:07.9617042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9617219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9617599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9617790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9618155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9618379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9618760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9618931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9619186Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpguh0t0q7 2022-11-23T02:05:07.9619597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpguh0t0q7/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9619849Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzfzmx101 2022-11-23T02:05:07.9620114Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzfzmx101/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9620341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9620612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9620718Z ok 
(6.213s) 2022-11-23T02:05:07.9620737Z 2022-11-23T02:05:07.9621007Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9621099Z Ran 1 test in 6.213s 2022-11-23T02:05:07.9621118Z 2022-11-23T02:05:07.9621208Z OK 2022-11-23T02:05:07.9621227Z 2022-11-23T02:05:07.9621349Z Generating XML reports... 2022-11-23T02:05:07.9621806Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015417.xml 2022-11-23T02:05:07.9622247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9622422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9622796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9622990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9623227Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0_4p11j3 2022-11-23T02:05:07.9623489Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0_4p11j3/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9623508Z 2022-11-23T02:05:07.9623616Z Running tests... 2022-11-23T02:05:07.9624120Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9624446Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9624685Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9624955Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9625170Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8771 2022-11-23T02:05:07.9625370Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8772 2022-11-23T02:05:07.9625743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9625922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9626299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9626491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9626849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9627022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9627394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9627585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9627894Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmperp6aqcx 2022-11-23T02:05:07.9628172Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmperp6aqcx/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9628424Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz8r0ih_z 2022-11-23T02:05:07.9628691Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz8r0ih_z/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9628917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9629139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9629373Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9629605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9629886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9630116Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9630220Z ok (6.243s) 2022-11-23T02:05:07.9630240Z 2022-11-23T02:05:07.9630508Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9630621Z Ran 1 test in 6.244s 2022-11-23T02:05:07.9630640Z 2022-11-23T02:05:07.9630731Z OK 2022-11-23T02:05:07.9630750Z 2022-11-23T02:05:07.9630875Z Generating XML reports... 2022-11-23T02:05:07.9631332Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015425.xml 2022-11-23T02:05:07.9631704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9631865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9632247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9632439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9632695Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1vm0n97i 2022-11-23T02:05:07.9632963Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1vm0n97i/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9632982Z 2022-11-23T02:05:07.9633090Z Running tests... 
2022-11-23T02:05:07.9633353Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9633663Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9633902Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9634172Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9634399Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8986 2022-11-23T02:05:07.9634612Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8987 2022-11-23T02:05:07.9634981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9635157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9635533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9635724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9636087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9636244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9636611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9636848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9637112Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptrpyutdu 2022-11-23T02:05:07.9637386Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptrpyutdu/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9637638Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpif30qw2p 2022-11-23T02:05:07.9637905Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpif30qw2p/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9638131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9638343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9639124Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:05:07.9639942Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:05:07.9640856Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:05:07.9640972Z warnings.warn( 2022-11-23T02:05:07.9641873Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9641967Z warnings.warn( 2022-11-23T02:05:07.9642206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9642447Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9642681Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9642915Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9643016Z ok (6.341s) 2022-11-23T02:05:07.9643035Z 2022-11-23T02:05:07.9643307Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9643422Z Ran 1 test in 6.342s 2022-11-23T02:05:07.9643447Z 2022-11-23T02:05:07.9643520Z OK 2022-11-23T02:05:07.9643559Z 2022-11-23T02:05:07.9643665Z Generating XML reports... 
2022-11-23T02:05:07.9644126Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015434.xml 2022-11-23T02:05:07.9644497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9644679Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9645103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9645300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9645555Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpthjsborm 2022-11-23T02:05:07.9645825Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpthjsborm/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9645844Z 2022-11-23T02:05:07.9645934Z Running tests... 2022-11-23T02:05:07.9646197Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9646507Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9646763Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9647086Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9647304Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9201 2022-11-23T02:05:07.9647521Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9202 2022-11-23T02:05:07.9647894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9648050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9648424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9648615Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9648978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9649158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9649530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9649721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9649977Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps5t24xgk 2022-11-23T02:05:07.9650245Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps5t24xgk/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9650453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9650704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpclkhhp9g 2022-11-23T02:05:07.9650972Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpclkhhp9g/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9651197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9652108Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9652223Z warnings.warn( 2022-11-23T02:05:07.9653123Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9653233Z warnings.warn( 2022-11-23T02:05:07.9653473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9653758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9653977Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9654208Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9654309Z ok (6.248s) 2022-11-23T02:05:07.9654329Z 2022-11-23T02:05:07.9654600Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9654710Z Ran 1 test in 6.248s 2022-11-23T02:05:07.9654730Z 2022-11-23T02:05:07.9654820Z OK 2022-11-23T02:05:07.9654839Z 2022-11-23T02:05:07.9654965Z Generating XML reports... 
2022-11-23T02:05:07.9655424Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015443.xml 2022-11-23T02:05:07.9655794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9656060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9656440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9656633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9656889Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw6tl_7h4 2022-11-23T02:05:07.9657159Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw6tl_7h4/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9657179Z 2022-11-23T02:05:07.9657288Z Running tests... 2022-11-23T02:05:07.9657554Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9657864Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9658105Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9658349Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9658566Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9416 2022-11-23T02:05:07.9658783Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9417 2022-11-23T02:05:07.9659154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9659334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9659708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9659900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9660262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9660422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9660797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9660986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9661240Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3lddtnsm 2022-11-23T02:05:07.9661510Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3lddtnsm/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9661739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9661992Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwz428hn_ 2022-11-23T02:05:07.9662255Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwz428hn_/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9662462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9662750Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9662989Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9663213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9663446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9663549Z ok (6.322s) 2022-11-23T02:05:07.9663568Z 2022-11-23T02:05:07.9663843Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9664246Z Ran 1 test in 6.322s 2022-11-23T02:05:07.9664267Z 2022-11-23T02:05:07.9664339Z OK 2022-11-23T02:05:07.9664378Z 2022-11-23T02:05:07.9664485Z Generating XML reports... 2022-11-23T02:05:07.9664949Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015452.xml 2022-11-23T02:05:07.9665467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9665662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9666043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9666235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9666491Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmporx9xqhc 2022-11-23T02:05:07.9666759Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmporx9xqhc/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9666779Z 2022-11-23T02:05:07.9666867Z Running tests... 
2022-11-23T02:05:07.9667132Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9667441Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9667708Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9667945Z Test that checkpointing with weight sharing works. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9668163Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9631 2022-11-23T02:05:07.9668380Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9632 2022-11-23T02:05:07.9668749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9668905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9669279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9669468Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9669833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9670006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9670381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9670568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9670825Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplyj34ue9 2022-11-23T02:05:07.9671095Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplyj34ue9/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9671301Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 0 2022-11-23T02:05:07.9671555Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkqrp1e1y 2022-11-23T02:05:07.9671822Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkqrp1e1y/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9672132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9672376Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9672608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9672840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9673069Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9673282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9673510Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9673736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9674007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9674110Z ok (6.365s) 2022-11-23T02:05:07.9674130Z 2022-11-23T02:05:07.9676258Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9676373Z Ran 1 test in 6.365s 2022-11-23T02:05:07.9676393Z 2022-11-23T02:05:07.9676485Z OK 2022-11-23T02:05:07.9676504Z 2022-11-23T02:05:07.9676628Z Generating XML reports... 
2022-11-23T02:05:07.9677072Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015500.xml 2022-11-23T02:05:07.9677443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9677620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9677993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9678188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9678444Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprzgpnnee 2022-11-23T02:05:07.9678713Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprzgpnnee/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9678733Z 2022-11-23T02:05:07.9678839Z Running tests... 2022-11-23T02:05:07.9679103Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9679394Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9679619Z test_ddp_comm_hook_future_passing_cpu (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9679883Z This unit test verifies whether the Future object is passed properly. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9680099Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9846 2022-11-23T02:05:07.9680310Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9847 2022-11-23T02:05:07.9680685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9680858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9681236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9681406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9681768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9681940Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9682309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9682498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9682806Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7szdh9fz 2022-11-23T02:05:07.9683086Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7szdh9fz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9683338Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj0pvo94i 2022-11-23T02:05:07.9683606Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj0pvo94i/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9683814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9684042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9684143Z ok 
(4.035s) 2022-11-23T02:05:07.9684163Z 2022-11-23T02:05:07.9684430Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9684540Z Ran 1 test in 4.035s 2022-11-23T02:05:07.9684560Z 2022-11-23T02:05:07.9684698Z OK 2022-11-23T02:05:07.9684717Z 2022-11-23T02:05:07.9684842Z Generating XML reports... 2022-11-23T02:05:07.9685307Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015509.xml 2022-11-23T02:05:07.9685655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9685831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9686206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9686397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9686649Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkhy98xcu 2022-11-23T02:05:07.9686914Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkhy98xcu/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9686938Z 2022-11-23T02:05:07.9687045Z Running tests... 2022-11-23T02:05:07.9687312Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9687622Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9687834Z test_ddp_comm_hook_future_passing_gpu_gloo (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9688126Z This unit test verifies whether the Future object is passed properly using gloo backend. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9688344Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10059 2022-11-23T02:05:07.9688560Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10060 2022-11-23T02:05:07.9688924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9689098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9689482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9689673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9690019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9690193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9690563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9690749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9691004Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4_d05h53 2022-11-23T02:05:07.9691268Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4_d05h53/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9691520Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu378e474 2022-11-23T02:05:07.9691830Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu378e474/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9692063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9692272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9692375Z ok 
(5.729s) 2022-11-23T02:05:07.9692395Z 2022-11-23T02:05:07.9692660Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9692772Z Ran 1 test in 5.729s 2022-11-23T02:05:07.9692791Z 2022-11-23T02:05:07.9692882Z OK 2022-11-23T02:05:07.9692900Z 2022-11-23T02:05:07.9693022Z Generating XML reports... 2022-11-23T02:05:07.9693482Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015516.xml 2022-11-23T02:05:07.9693845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9694051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9694428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9694619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9694868Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyslgnr0v 2022-11-23T02:05:07.9695134Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyslgnr0v/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9695154Z 2022-11-23T02:05:07.9695262Z Running tests... 2022-11-23T02:05:07.9695527Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9695833Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9696052Z test_ddp_comm_hook_register_just_once (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9696318Z DDP communication hook can only be registered once. This test validates whether ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9696535Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10274 2022-11-23T02:05:07.9696749Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10275 2022-11-23T02:05:07.9697111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9697284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9697661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9697848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9698204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9698365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9698737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9698924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9699175Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2n_mfbd6 2022-11-23T02:05:07.9699439Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2n_mfbd6/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9699668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9699919Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps1zzr1a3 2022-11-23T02:05:07.9700182Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps1zzr1a3/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9700409Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9700537Z ok 
(4.141s) 2022-11-23T02:05:07.9700558Z 2022-11-23T02:05:07.9700831Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9700945Z Ran 1 test in 4.142s 2022-11-23T02:05:07.9700964Z 2022-11-23T02:05:07.9701056Z OK 2022-11-23T02:05:07.9701075Z 2022-11-23T02:05:07.9701198Z Generating XML reports... 2022-11-23T02:05:07.9701653Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015524.xml 2022-11-23T02:05:07.9702018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9702192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9702544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9702786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9703046Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn_mwq6fb 2022-11-23T02:05:07.9703314Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn_mwq6fb/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9703334Z 2022-11-23T02:05:07.9703442Z Running tests... 2022-11-23T02:05:07.9703705Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9704327Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9704561Z test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9704835Z Runs "test_sparse_gradients" unit test with DDP communication hook. We define a ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9705035Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10479 2022-11-23T02:05:07.9705256Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10480 2022-11-23T02:05:07.9705635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9705812Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9706186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9706374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9706734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9706907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9707261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9707449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9707708Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvcz_48ak 2022-11-23T02:05:07.9707971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvcz_48ak/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9708224Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj752znhk 2022-11-23T02:05:07.9708487Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj752znhk/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9708713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9708938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9709038Z ok 
(4.138s) 2022-11-23T02:05:07.9709058Z 2022-11-23T02:05:07.9709306Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9709417Z Ran 1 test in 4.139s 2022-11-23T02:05:07.9709441Z 2022-11-23T02:05:07.9709531Z OK 2022-11-23T02:05:07.9709550Z 2022-11-23T02:05:07.9709748Z Generating XML reports... 2022-11-23T02:05:07.9710220Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015530.xml 2022-11-23T02:05:07.9710587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9710762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9711134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9711303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9711557Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp559kb1yh 2022-11-23T02:05:07.9711823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp559kb1yh/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9711900Z 2022-11-23T02:05:07.9712014Z Running tests... 2022-11-23T02:05:07.9712283Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9712594Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9712807Z test_ddp_invalid_comm_hook_init (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9713080Z This unit test makes sure that register_comm_hook properly checks the format ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9713295Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10754 2022-11-23T02:05:07.9713493Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10755 2022-11-23T02:05:07.9713859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9714033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9714415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9714605Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9714967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9715139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9715508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9715678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9715926Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpab2_0zck 2022-11-23T02:05:07.9716189Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpab2_0zck/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9716439Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd63qkpjk 2022-11-23T02:05:07.9716708Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd63qkpjk/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9716933Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9717153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9717253Z ok 
(4.109s) 2022-11-23T02:05:07.9717272Z 2022-11-23T02:05:07.9717538Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9717631Z Ran 1 test in 4.109s 2022-11-23T02:05:07.9717650Z 2022-11-23T02:05:07.9717741Z OK 2022-11-23T02:05:07.9717760Z 2022-11-23T02:05:07.9717882Z Generating XML reports... 2022-11-23T02:05:07.9718338Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015537.xml 2022-11-23T02:05:07.9718702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9718927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9719311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9719501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9719734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2e78xezj 2022-11-23T02:05:07.9719998Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2e78xezj/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9720018Z 2022-11-23T02:05:07.9720125Z Running tests... 2022-11-23T02:05:07.9720388Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9720695Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9720917Z test_ddp_invalid_comm_hook_return_type (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9721246Z This test checks whether return annotation checked properly if defined. It also ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9721463Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10959 2022-11-23T02:05:07.9721681Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10960 2022-11-23T02:05:07.9722032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9722209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9722582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9722771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9723130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9723308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9723677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9723863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9724100Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw931bwp2 2022-11-23T02:05:07.9724366Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw931bwp2/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9724591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9724841Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa_mba20y 2022-11-23T02:05:07.9725106Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa_mba20y/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9725335Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9725437Z ok 
(4.092s) 2022-11-23T02:05:07.9725458Z 2022-11-23T02:05:07.9725725Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9725817Z Ran 1 test in 4.092s 2022-11-23T02:05:07.9725855Z 2022-11-23T02:05:07.9725927Z OK 2022-11-23T02:05:07.9725945Z 2022-11-23T02:05:07.9726067Z Generating XML reports... 2022-11-23T02:05:07.9726528Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015543.xml 2022-11-23T02:05:07.9726892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9727066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9727440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9727633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9727990Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprgp81i_9 2022-11-23T02:05:07.9728248Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprgp81i_9/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9728288Z 2022-11-23T02:05:07.9728378Z Running tests... 2022-11-23T02:05:07.9728644Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9728953Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9729208Z test_find_unused_parameters_when_unused_parameters_empty (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9729475Z An empty unused_parameters array does not imply find_unused_parameters = ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9729693Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11172 2022-11-23T02:05:07.9729956Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11173 2022-11-23T02:05:07.9730328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9730484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9730865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9731055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9731422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9731596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9731966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9732154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9732415Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8vvhjrqr 2022-11-23T02:05:07.9732666Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8vvhjrqr/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9732893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9733146Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzqcbvfz3 2022-11-23T02:05:07.9733412Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzqcbvfz3/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9733637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9734412Z [W 
reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:05:07.9734521Z ok (5.737s) 2022-11-23T02:05:07.9734541Z 2022-11-23T02:05:07.9734808Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9734919Z Ran 1 test in 5.738s 2022-11-23T02:05:07.9734938Z 2022-11-23T02:05:07.9735028Z OK 2022-11-23T02:05:07.9735047Z 2022-11-23T02:05:07.9735150Z Generating XML reports... 2022-11-23T02:05:07.9735609Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015550.xml 2022-11-23T02:05:07.9735975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9736153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9736594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9736792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9737046Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgwbks3er 2022-11-23T02:05:07.9737316Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgwbks3er/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9737336Z 2022-11-23T02:05:07.9737445Z Running tests... 
2022-11-23T02:05:07.9737690Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9738002Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9738287Z test_global_local_unused_params_grad (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9738557Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11387 2022-11-23T02:05:07.9738780Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11388 2022-11-23T02:05:07.9739150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9739327Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9739704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9739895Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9740236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9740409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9740774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9740966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9741220Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4ot71tgc 2022-11-23T02:05:07.9741487Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4ot71tgc/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9741739Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcrpag6d9 2022-11-23T02:05:07.9742004Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcrpag6d9/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9742211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9742435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9742535Z ok (5.737s) 2022-11-23T02:05:07.9742554Z 2022-11-23T02:05:07.9742818Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9742931Z Ran 1 test in 5.737s 2022-11-23T02:05:07.9742953Z 2022-11-23T02:05:07.9743044Z OK 2022-11-23T02:05:07.9743063Z 2022-11-23T02:05:07.9743184Z Generating XML reports... 2022-11-23T02:05:07.9743643Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015558.xml 2022-11-23T02:05:07.9744255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9744422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9744803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9744996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9745251Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm9lu6fz5 2022-11-23T02:05:07.9745592Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm9lu6fz5/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9745615Z 2022-11-23T02:05:07.9745730Z Running tests... 
2022-11-23T02:05:07.9746004Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9746312Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9746599Z test_global_local_unused_params_grad_with_grad_is_view (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9746817Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11602 2022-11-23T02:05:07.9747030Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11603 2022-11-23T02:05:07.9747398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9747572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9748013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9748205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9748572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9748741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9749091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9749281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9749535Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ckazyxv 2022-11-23T02:05:07.9749803Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ckazyxv/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9750035Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9750290Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpztkou99f 2022-11-23T02:05:07.9750554Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpztkou99f/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9750782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9750863Z ok (5.869s) 2022-11-23T02:05:07.9750883Z 2022-11-23T02:05:07.9751148Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9751259Z Ran 1 test in 5.869s 2022-11-23T02:05:07.9751278Z 2022-11-23T02:05:07.9751367Z OK 2022-11-23T02:05:07.9751385Z 2022-11-23T02:05:07.9751507Z Generating XML reports... 2022-11-23T02:05:07.9751961Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015606.xml 2022-11-23T02:05:07.9752331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9752505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9752877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9753048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9753302Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpomi6s30g 2022-11-23T02:05:07.9753567Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpomi6s30g/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9753587Z 2022-11-23T02:05:07.9753693Z Running tests... 
2022-11-23T02:05:07.9753953Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9754345Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9754719Z test_global_local_unused_params_grad_with_static_graph (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9754941Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11817 2022-11-23T02:05:07.9755139Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11818 2022-11-23T02:05:07.9755517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9755690Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9756066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9756254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9756615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9756838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9757213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9757401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9757637Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgzamf9lt 2022-11-23T02:05:07.9757906Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgzamf9lt/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9758162Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpapbn4ram 2022-11-23T02:05:07.9758431Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpapbn4ram/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9758654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9758879Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9759789Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9759902Z warnings.warn( 2022-11-23T02:05:07.9760801Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:05:07.9760912Z warnings.warn( 2022-11-23T02:05:07.9760993Z ok (5.715s) 2022-11-23T02:05:07.9761016Z 2022-11-23T02:05:07.9761279Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9761391Z Ran 1 test in 5.715s 2022-11-23T02:05:07.9761410Z 2022-11-23T02:05:07.9761502Z OK 2022-11-23T02:05:07.9761520Z 2022-11-23T02:05:07.9761644Z Generating XML reports... 
2022-11-23T02:05:07.9762101Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015615.xml 2022-11-23T02:05:07.9762467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9762641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9763018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9763189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9763444Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5mepg14b 2022-11-23T02:05:07.9763761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5mepg14b/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9763783Z 2022-11-23T02:05:07.9763895Z Running tests... 2022-11-23T02:05:07.9764163Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9764470Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9764773Z test_gloo_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9764992Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12032 2022-11-23T02:05:07.9765192Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12033 2022-11-23T02:05:07.9765668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9765897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9766282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9766474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9766838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9767016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9767390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9767579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9767816Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps1zqyk1o 2022-11-23T02:05:07.9768084Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps1zqyk1o/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9768341Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqhgfeznt 2022-11-23T02:05:07.9768607Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqhgfeznt/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9768831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9769054Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9769286Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9769516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9769599Z ok (6.220s) 2022-11-23T02:05:07.9769618Z 2022-11-23T02:05:07.9769886Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9769999Z Ran 1 test in 6.220s 2022-11-23T02:05:07.9770019Z 2022-11-23T02:05:07.9770112Z OK 2022-11-23T02:05:07.9770131Z 2022-11-23T02:05:07.9770251Z Generating XML reports... 2022-11-23T02:05:07.9770711Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015623.xml 2022-11-23T02:05:07.9771078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9771250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9771622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9771793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9772043Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfa89bup4 2022-11-23T02:05:07.9772314Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfa89bup4/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9772337Z 2022-11-23T02:05:07.9772444Z Running tests... 2022-11-23T02:05:07.9772752Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9773068Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9773379Z test_gloo_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9773598Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12247 2022-11-23T02:05:07.9773795Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12248 2022-11-23T02:05:07.9774161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9774335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9774703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9774940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9775302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9775471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9775835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9776019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9776256Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx4w1cqdi 2022-11-23T02:05:07.9776523Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx4w1cqdi/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9776772Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuv_vxgue 2022-11-23T02:05:07.9777038Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuv_vxgue/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9777265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9777483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9777714Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9777944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9778026Z ok (6.237s) 2022-11-23T02:05:07.9778046Z 2022-11-23T02:05:07.9778308Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9778418Z Ran 1 test in 6.237s 2022-11-23T02:05:07.9778438Z 2022-11-23T02:05:07.9778527Z OK 2022-11-23T02:05:07.9778546Z 2022-11-23T02:05:07.9778666Z Generating XML reports... 2022-11-23T02:05:07.9779123Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015632.xml 2022-11-23T02:05:07.9779503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9779682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9780063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9780234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9780487Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp342x5dgn 2022-11-23T02:05:07.9780748Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp342x5dgn/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9780768Z 2022-11-23T02:05:07.9780873Z Running tests... 2022-11-23T02:05:07.9781136Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9781445Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9781765Z test_gloo_backend_2gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9781990Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12462 2022-11-23T02:05:07.9782187Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12463 2022-11-23T02:05:07.9782564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9782745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9783126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9783325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9783689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9784168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9784559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9784751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9784989Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4xgbojuf 2022-11-23T02:05:07.9785258Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4xgbojuf/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9785482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9785735Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4gkfc68k 2022-11-23T02:05:07.9785997Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4gkfc68k/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9786230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9786582Z [W 
logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:05:07.9786922Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:05:07.9787138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9787374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9787475Z ok (7.951s) 2022-11-23T02:05:07.9787494Z 2022-11-23T02:05:07.9787761Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9787879Z Ran 1 test in 7.951s 2022-11-23T02:05:07.9787899Z 2022-11-23T02:05:07.9787994Z OK 2022-11-23T02:05:07.9788013Z 2022-11-23T02:05:07.9788135Z Generating XML reports... 2022-11-23T02:05:07.9788597Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015640.xml 2022-11-23T02:05:07.9788968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9789125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9789497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9789689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9789939Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsk5z99eb 2022-11-23T02:05:07.9790202Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsk5z99eb/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9790222Z 2022-11-23T02:05:07.9790334Z Running tests... 
2022-11-23T02:05:07.9790600Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9790986Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9791250Z test_gloo_backend_4gpu_module (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9791467Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12679 2022-11-23T02:05:07.9791684Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12680 2022-11-23T02:05:07.9792053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9792232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9792614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9792809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9793247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9793430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9793780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9793973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9794231Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk74dl1zm 2022-11-23T02:05:07.9794506Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk74dl1zm/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9794764Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvspv0b8a 2022-11-23T02:05:07.9795029Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpvspv0b8a/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9795259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9795500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9795630Z skip: Need at least 8 CUDA devices (4.045s) 2022-11-23T02:05:07.9795675Z 2022-11-23T02:05:07.9795923Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9796041Z Ran 1 test in 4.046s 2022-11-23T02:05:07.9796061Z 2022-11-23T02:05:07.9796173Z OK (skipped=1) 2022-11-23T02:05:07.9796192Z 2022-11-23T02:05:07.9796317Z Generating XML reports... 2022-11-23T02:05:07.9796775Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015651.xml 2022-11-23T02:05:07.9797142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9797314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9797690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9797867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9798119Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9ohcpzlv 2022-11-23T02:05:07.9798390Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9ohcpzlv/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9798409Z 2022-11-23T02:05:07.9798515Z Running tests... 2022-11-23T02:05:07.9798779Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9799086Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9799355Z test_gloo_backend_cpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9799568Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12878 2022-11-23T02:05:07.9799767Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12879 2022-11-23T02:05:07.9800243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9800426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9800807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9800995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9801353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9801527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9801893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9802079Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9802373Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6zuci09u 2022-11-23T02:05:07.9802639Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6zuci09u/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9802886Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1l5mwcrf 2022-11-23T02:05:07.9803153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1l5mwcrf/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9803378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9803604Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9803838Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9804064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9804150Z ok (4.133s) 2022-11-23T02:05:07.9804187Z 2022-11-23T02:05:07.9804441Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9804555Z Ran 1 test in 4.133s 2022-11-23T02:05:07.9804574Z 2022-11-23T02:05:07.9804665Z OK 2022-11-23T02:05:07.9804684Z 2022-11-23T02:05:07.9804808Z Generating XML reports... 2022-11-23T02:05:07.9805264Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015657.xml 2022-11-23T02:05:07.9805629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9805802Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9806172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9806342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9806601Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptvd4ghmb 2022-11-23T02:05:07.9806870Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptvd4ghmb/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9806890Z 2022-11-23T02:05:07.9806997Z Running tests... 2022-11-23T02:05:07.9807265Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9807570Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9807857Z test_gloo_backend_cpu_module_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9808070Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13091 2022-11-23T02:05:07.9808286Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13092 2022-11-23T02:05:07.9808632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9808858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9809241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9809430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9809784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9809955Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9810323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9810507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9810742Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy96k6cbo 2022-11-23T02:05:07.9811007Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy96k6cbo/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9811333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9e0rqoa_ 2022-11-23T02:05:07.9811597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9e0rqoa_/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9811826Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9812052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9812285Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9812511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9812594Z ok (4.117s) 2022-11-23T02:05:07.9812632Z 2022-11-23T02:05:07.9812880Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9812993Z Ran 1 test in 4.117s 2022-11-23T02:05:07.9813015Z 2022-11-23T02:05:07.9813106Z OK 2022-11-23T02:05:07.9813125Z 2022-11-23T02:05:07.9813255Z Generating XML reports... 2022-11-23T02:05:07.9813718Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015704.xml 2022-11-23T02:05:07.9814085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9814259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9814632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9814802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9815053Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj1f2oxtp 2022-11-23T02:05:07.9815316Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj1f2oxtp/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9815340Z 2022-11-23T02:05:07.9815449Z Running tests... 2022-11-23T02:05:07.9815714Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9816019Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9816213Z test_ignored_output (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9816465Z Test that the output of a model can be ignored and that there is no ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9816662Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13304 2022-11-23T02:05:07.9816874Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13305 2022-11-23T02:05:07.9817238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9817410Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9817832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9818025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9818387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9818557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9818931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9819101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9819352Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfkzeccov 2022-11-23T02:05:07.9819619Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfkzeccov/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9819872Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl2gxcp20 2022-11-23T02:05:07.9820406Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl2gxcp20/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9820636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9820862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9821096Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9821313Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9821411Z ok (4.067s) 2022-11-23T02:05:07.9821430Z 2022-11-23T02:05:07.9821696Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9821806Z Ran 1 test in 4.067s 2022-11-23T02:05:07.9821826Z 2022-11-23T02:05:07.9821916Z OK 2022-11-23T02:05:07.9821935Z 2022-11-23T02:05:07.9822062Z Generating XML reports... 2022-11-23T02:05:07.9822528Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015710.xml 2022-11-23T02:05:07.9822895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9823072Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9823433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9823623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9824128Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpljw_g59j 2022-11-23T02:05:07.9824407Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpljw_g59j/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9824427Z 2022-11-23T02:05:07.9824537Z Running tests... 2022-11-23T02:05:07.9824813Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9825127Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9825361Z test_ignored_output_with_unused_parameters (__main__.DistributedDataParallelTest) 2022-11-23T02:05:07.9825595Z Test that the output of a model can be ignored and that there is no ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9825813Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13579 2022-11-23T02:05:07.9826029Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13580 2022-11-23T02:05:07.9826399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9826572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9826948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9827214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9827591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9827767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9828119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9828308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9828561Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplrjq_j99 2022-11-23T02:05:07.9828829Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplrjq_j99/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9829081Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmryzq2os 2022-11-23T02:05:07.9829347Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmryzq2os/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9829636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9829858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9829940Z ok 
(4.162s) 2022-11-23T02:05:07.9829981Z 2022-11-23T02:05:07.9830232Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9830343Z Ran 1 test in 4.162s 2022-11-23T02:05:07.9830362Z 2022-11-23T02:05:07.9830454Z OK 2022-11-23T02:05:07.9830473Z 2022-11-23T02:05:07.9830599Z Generating XML reports... 2022-11-23T02:05:07.9831054Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015717.xml 2022-11-23T02:05:07.9831419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9831597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9831975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9832145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9832395Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnf2mm205 2022-11-23T02:05:07.9832659Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnf2mm205/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9832679Z 2022-11-23T02:05:07.9832785Z Running tests... 2022-11-23T02:05:07.9833048Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9833355Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9833629Z test_ignored_sharded_tensor (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9833850Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13854 2022-11-23T02:05:07.9834051Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13855 2022-11-23T02:05:07.9834420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9834593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9834966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9835154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9835514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9835688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9836056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9836289Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9836531Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr3bnp7h2 2022-11-23T02:05:07.9836800Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr3bnp7h2/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9837051Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_xoly43v 2022-11-23T02:05:07.9837313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_xoly43v/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9837540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9837766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9838009Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9838306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9838687Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9839080Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9839186Z ok (5.725s) 2022-11-23T02:05:07.9839207Z 2022-11-23T02:05:07.9839468Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9839575Z Ran 1 test in 5.726s 2022-11-23T02:05:07.9839595Z 2022-11-23T02:05:07.9839685Z OK 2022-11-23T02:05:07.9839704Z 2022-11-23T02:05:07.9839827Z Generating XML reports... 2022-11-23T02:05:07.9840283Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015724.xml 2022-11-23T02:05:07.9840650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9840813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9841188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9841378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9841633Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjgvwkfic 2022-11-23T02:05:07.9841903Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjgvwkfic/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9841923Z 2022-11-23T02:05:07.9842030Z Running tests... 
2022-11-23T02:05:07.9842297Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9842607Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9842883Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9843089Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14061 2022-11-23T02:05:07.9843306Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14062 2022-11-23T02:05:07.9843676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9843848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9844225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9844415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9844776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9844951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9845352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9845546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9845800Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa6l_yt53 2022-11-23T02:05:07.9846067Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa6l_yt53/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9846322Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyu8pwvs2 2022-11-23T02:05:07.9846588Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpyu8pwvs2/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9846812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9847359Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9847949Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9848487Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9849023Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9849553Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; 
batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9850087Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9850322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9850848Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9851374Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9851938Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9852482Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 
2022-11-23T02:05:07.9853009Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9853533Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:05:07.9853681Z ok (4.009s) 2022-11-23T02:05:07.9853702Z 2022-11-23T02:05:07.9853955Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9854068Z Ran 1 test in 4.009s 2022-11-23T02:05:07.9854087Z 2022-11-23T02:05:07.9854182Z OK 2022-11-23T02:05:07.9854201Z 2022-11-23T02:05:07.9854329Z Generating XML reports... 
2022-11-23T02:05:07.9854790Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015732.xml 2022-11-23T02:05:07.9855160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9855335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9855715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9855905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9856142Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp358ikrp3 2022-11-23T02:05:07.9856409Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp358ikrp3/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9856429Z 2022-11-23T02:05:07.9856535Z Running tests... 2022-11-23T02:05:07.9856799Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9857104Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9857372Z test_save_load_checkpoint (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9857588Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14260 2022-11-23T02:05:07.9857809Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14261 2022-11-23T02:05:07.9858157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9858330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9858705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9858894Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9859252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9859424Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9859797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9860048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9860293Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3q6401r1 2022-11-23T02:05:07.9860560Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3q6401r1/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9860812Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuzbeu_jc 2022-11-23T02:05:07.9861080Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuzbeu_jc/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9861307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9861534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9861777Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9862065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9862472Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9862848Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:05:07.9863086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9863322Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9863554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9863783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9864116Z ok (7.346s) 2022-11-23T02:05:07.9864138Z 2022-11-23T02:05:07.9864417Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9864534Z Ran 1 test in 7.346s 2022-11-23T02:05:07.9864554Z 2022-11-23T02:05:07.9864647Z OK 2022-11-23T02:05:07.9864666Z 2022-11-23T02:05:07.9864772Z Generating XML reports... 
2022-11-23T02:05:07.9865236Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015738.xml 2022-11-23T02:05:07.9865700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9865878Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9866256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9866447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9866700Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprrs3lgrz 2022-11-23T02:05:07.9867090Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprrs3lgrz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9867116Z 2022-11-23T02:05:07.9867222Z Running tests... 2022-11-23T02:05:07.9867470Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9867780Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9868043Z test_sparse_gradients (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9868259Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14476 2022-11-23T02:05:07.9868472Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14477 2022-11-23T02:05:07.9868836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9869010Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9869385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9869638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9870011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9870188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9870558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9870745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9870998Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi467zjdo 2022-11-23T02:05:07.9871262Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi467zjdo/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9871546Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp41smadsh 2022-11-23T02:05:07.9871820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9872085Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp41smadsh/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9872310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9872412Z ok 
(4.111s) 2022-11-23T02:05:07.9872432Z 2022-11-23T02:05:07.9872700Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9872810Z Ran 1 test in 4.111s 2022-11-23T02:05:07.9872830Z 2022-11-23T02:05:07.9872920Z OK 2022-11-23T02:05:07.9872939Z 2022-11-23T02:05:07.9873062Z Generating XML reports... 2022-11-23T02:05:07.9873520Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015748.xml 2022-11-23T02:05:07.9873869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9874051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9874421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9874610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9874865Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy5j1ijy_ 2022-11-23T02:05:07.9875131Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy5j1ijy_/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9875151Z 2022-11-23T02:05:07.9875258Z Running tests... 2022-11-23T02:05:07.9875518Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9875807Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9876089Z test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9876313Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14751 2022-11-23T02:05:07.9876528Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14752 2022-11-23T02:05:07.9876898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9877070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9877440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9877630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9877991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9878144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9878558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9878754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9879012Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5u5ijyjz 2022-11-23T02:05:07.9879282Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5u5ijyjz/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9879532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvw2b0j_9 2022-11-23T02:05:07.9879799Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvw2b0j_9/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9880025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9880228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9880328Z ok 
(4.129s) 2022-11-23T02:05:07.9880390Z 2022-11-23T02:05:07.9880665Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9880778Z Ran 1 test in 4.130s 2022-11-23T02:05:07.9880797Z 2022-11-23T02:05:07.9880887Z OK 2022-11-23T02:05:07.9880906Z 2022-11-23T02:05:07.9881030Z Generating XML reports... 2022-11-23T02:05:07.9881489Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015755.xml 2022-11-23T02:05:07.9881854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9882028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9882385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9882573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9882828Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyqvkx59u 2022-11-23T02:05:07.9883100Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyqvkx59u/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9883120Z 2022-11-23T02:05:07.9883228Z Running tests... 2022-11-23T02:05:07.9883492Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9883798Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9884074Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9884271Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15026 2022-11-23T02:05:07.9884488Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15027 2022-11-23T02:05:07.9884851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9885028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9885404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9885592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9885959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9886131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9886502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9886671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9886925Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbnappkya 2022-11-23T02:05:07.9887194Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbnappkya/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9887494Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkuvxpers 2022-11-23T02:05:07.9887768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkuvxpers/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9887996Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9888223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9888458Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9888671Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9888778Z ok (7.236s) 2022-11-23T02:05:07.9888797Z 2022-11-23T02:05:07.9889070Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9889180Z Ran 1 test in 7.237s 2022-11-23T02:05:07.9889199Z 2022-11-23T02:05:07.9889336Z OK 2022-11-23T02:05:07.9889355Z 2022-11-23T02:05:07.9889480Z Generating XML reports... 2022-11-23T02:05:07.9889941Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015801.xml 2022-11-23T02:05:07.9890309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9890483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9890843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9891034Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9891288Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw1k5tcq_ 2022-11-23T02:05:07.9891553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw1k5tcq_/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9891576Z 2022-11-23T02:05:07.9891683Z Running tests... 2022-11-23T02:05:07.9891950Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9892260Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9892540Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9892737Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15241 2022-11-23T02:05:07.9892954Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15242 2022-11-23T02:05:07.9893318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9893496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9893873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9894066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9894432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9894604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9894971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9895141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9895393Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3ljq6f1d 2022-11-23T02:05:07.9895659Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3ljq6f1d/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9895885Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9896135Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplbfl9n50 2022-11-23T02:05:07.9896454Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplbfl9n50/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9896688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9896923Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9897135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:05:07.9897238Z ok (6.551s) 2022-11-23T02:05:07.9897260Z 2022-11-23T02:05:07.9897529Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9897640Z Ran 1 test in 6.551s 2022-11-23T02:05:07.9897659Z 2022-11-23T02:05:07.9897750Z OK 2022-11-23T02:05:07.9897768Z 2022-11-23T02:05:07.9897894Z Generating XML reports... 2022-11-23T02:05:07.9898359Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015811.xml 2022-11-23T02:05:07.9898813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9898995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9899351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9899547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9899803Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_59xo1rc 2022-11-23T02:05:07.9900071Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_59xo1rc/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9900091Z 2022-11-23T02:05:07.9900199Z Running tests... 2022-11-23T02:05:07.9900464Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9900771Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9901110Z test_allgather_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9901308Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15456 2022-11-23T02:05:07.9901676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9901858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9902232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9902425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9902675Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk_hzndrl 2022-11-23T02:05:07.9902940Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk_hzndrl/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9903171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9903414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9903796Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:05:07.9904811Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9905052Z warnings.warn( 2022-11-23T02:05:07.9905152Z ok (4.069s) 2022-11-23T02:05:07.9905174Z 2022-11-23T02:05:07.9905440Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9905553Z Ran 1 test in 4.069s 2022-11-23T02:05:07.9905577Z 2022-11-23T02:05:07.9905669Z OK 2022-11-23T02:05:07.9905688Z 2022-11-23T02:05:07.9905880Z Generating XML reports... 2022-11-23T02:05:07.9906440Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123015820.xml 2022-11-23T02:05:07.9906792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9906969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9907353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9907546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9907804Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2wpktisl 2022-11-23T02:05:07.9908073Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2wpktisl/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9908150Z 2022-11-23T02:05:07.9908265Z Running tests... 2022-11-23T02:05:07.9908532Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9908844Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9909161Z test_allreduce_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9909380Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15592 2022-11-23T02:05:07.9909744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9909918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9910291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9910486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9910746Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl2tql39k 2022-11-23T02:05:07.9911011Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl2tql39k/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9911220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9911468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9911872Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:05:07.9912615Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9912735Z warnings.warn( 2022-11-23T02:05:07.9912838Z ok (4.037s) 2022-11-23T02:05:07.9912863Z 2022-11-23T02:05:07.9913128Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9913241Z Ran 1 test in 4.037s 2022-11-23T02:05:07.9913260Z 2022-11-23T02:05:07.9913355Z OK 2022-11-23T02:05:07.9913374Z 2022-11-23T02:05:07.9913480Z Generating XML reports... 2022-11-23T02:05:07.9914026Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123015826.xml 2022-11-23T02:05:07.9914391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9914564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9914936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9915132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9915441Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzp39p0sn 2022-11-23T02:05:07.9915718Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzp39p0sn/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9915738Z 2022-11-23T02:05:07.9915847Z Running tests... 2022-11-23T02:05:07.9916093Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9916403Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9916726Z test_collectives (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9916945Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15728 2022-11-23T02:05:07.9917308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9917536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9917924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9918124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9918360Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu142qs3a 2022-11-23T02:05:07.9918626Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu142qs3a/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9918852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9919096Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9919494Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:05:07.9919596Z ok (4.030s) 2022-11-23T02:05:07.9919619Z 2022-11-23T02:05:07.9919882Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9919994Z Ran 1 test in 4.030s 2022-11-23T02:05:07.9920013Z 2022-11-23T02:05:07.9920103Z OK 2022-11-23T02:05:07.9920122Z 2022-11-23T02:05:07.9920226Z Generating XML reports... 
2022-11-23T02:05:07.9920789Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123015833.xml 2022-11-23T02:05:07.9921154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9921328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9921699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9921888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9922145Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_aq2daqx 2022-11-23T02:05:07.9922414Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_aq2daqx/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9922434Z 2022-11-23T02:05:07.9922541Z Running tests... 2022-11-23T02:05:07.9922786Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9923094Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9923423Z test_monitored_barrier (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9923651Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15864 2022-11-23T02:05:07.9924012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9924191Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9924640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9924845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9925079Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqcjy2eyi 2022-11-23T02:05:07.9925351Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqcjy2eyi/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9925578Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9925822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9926223Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:05:07.9926325Z ok (4.057s) 2022-11-23T02:05:07.9926345Z 2022-11-23T02:05:07.9926607Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9926766Z Ran 1 test in 4.057s 2022-11-23T02:05:07.9926789Z 2022-11-23T02:05:07.9926880Z OK 2022-11-23T02:05:07.9926899Z 2022-11-23T02:05:07.9927002Z Generating XML reports... 
2022-11-23T02:05:07.9927553Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123015840.xml 2022-11-23T02:05:07.9927919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9928093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9928469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9928657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9928910Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiu3rs6gj 2022-11-23T02:05:07.9929187Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiu3rs6gj/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9929206Z 2022-11-23T02:05:07.9929313Z Running tests... 2022-11-23T02:05:07.9929559Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9929865Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9930111Z test_allgather_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9930326Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16000 2022-11-23T02:05:07.9930546Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16001 2022-11-23T02:05:07.9930758Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16002 2022-11-23T02:05:07.9930966Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16003 2022-11-23T02:05:07.9931343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9931500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9931873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9932062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9932420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9932590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9932959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9933146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9933502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9933707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9934083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9934271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9934634Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9934804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9935164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9935350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9935604Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz0vmwvda 2022-11-23T02:05:07.9935965Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz0vmwvda/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9936199Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0x5wbqve 2022-11-23T02:05:07.9936464Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0x5wbqve/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9936712Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdwwac5c5 2022-11-23T02:05:07.9936976Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdwwac5c5/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9937206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:07.9937431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9937654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:07.9937907Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw2xk72at 2022-11-23T02:05:07.9938155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw2xk72at/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9938375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9938476Z ok (4.238s) 
2022-11-23T02:05:07.9938496Z 2022-11-23T02:05:07.9938765Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9938877Z Ran 1 test in 4.238s 2022-11-23T02:05:07.9938896Z 2022-11-23T02:05:07.9938986Z OK 2022-11-23T02:05:07.9939005Z 2022-11-23T02:05:07.9939127Z Generating XML reports... 2022-11-23T02:05:07.9939556Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015846.xml 2022-11-23T02:05:07.9939924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9940084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9940462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9940652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9940903Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyy9nb8lp 2022-11-23T02:05:07.9941168Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyy9nb8lp/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9941188Z 2022-11-23T02:05:07.9941294Z Running tests... 2022-11-23T02:05:07.9941562Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9941876Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9942113Z test_allgather_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9942336Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16343 2022-11-23T02:05:07.9942595Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16344 2022-11-23T02:05:07.9942813Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16345 2022-11-23T02:05:07.9943023Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16346 2022-11-23T02:05:07.9943391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9943567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9944178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9944381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9944732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9944994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9945367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9945558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9945915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9946089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9946461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9946655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9947001Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9947181Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9947555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9947748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9948004Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp451eno9_ 2022-11-23T02:05:07.9948276Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp451eno9_/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9948507Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:07.9948759Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgda3jsd7 2022-11-23T02:05:07.9949025Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgda3jsd7/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9949258Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphn33iwqq 2022-11-23T02:05:07.9949531Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphn33iwqq/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9949755Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9950005Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuzbljetg 2022-11-23T02:05:07.9950268Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuzbljetg/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9950493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:07.9950713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9950813Z ok (6.130s) 
2022-11-23T02:05:07.9950832Z 2022-11-23T02:05:07.9951080Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9951190Z Ran 1 test in 6.131s 2022-11-23T02:05:07.9951213Z 2022-11-23T02:05:07.9951305Z OK 2022-11-23T02:05:07.9951323Z 2022-11-23T02:05:07.9951502Z Generating XML reports... 2022-11-23T02:05:07.9951940Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015853.xml 2022-11-23T02:05:07.9952310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9952490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9952871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9953067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9953304Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwlml8p4q 2022-11-23T02:05:07.9953585Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwlml8p4q/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9953653Z 2022-11-23T02:05:07.9953766Z Running tests... 2022-11-23T02:05:07.9954036Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9954345Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9954591Z test_allgather_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9954811Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16690 2022-11-23T02:05:07.9955027Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16691 2022-11-23T02:05:07.9955222Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16692 2022-11-23T02:05:07.9955433Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16693 2022-11-23T02:05:07.9955799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9955978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9956356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9956545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9956903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9957079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9957448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9957623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9957977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9958149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9958523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9958709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9959069Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9959243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9959615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9959784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9960039Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5uhv0ipu 2022-11-23T02:05:07.9960308Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5uhv0ipu/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9960538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:07.9960833Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbx7sx23o 2022-11-23T02:05:07.9961105Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbx7sx23o/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9961333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:07.9961584Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpizub8gz9 2022-11-23T02:05:07.9961847Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpizub8gz9/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9962077Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplo5mjkoq 2022-11-23T02:05:07.9962339Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplo5mjkoq/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9962561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9962838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9962940Z ok (4.215s) 
2022-11-23T02:05:07.9962961Z 2022-11-23T02:05:07.9963227Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9963339Z Ran 1 test in 4.216s 2022-11-23T02:05:07.9963360Z 2022-11-23T02:05:07.9963448Z OK 2022-11-23T02:05:07.9963467Z 2022-11-23T02:05:07.9963571Z Generating XML reports... 2022-11-23T02:05:07.9964075Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015901.xml 2022-11-23T02:05:07.9964446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9964621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9964998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9965193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9965483Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa6ecpwy2 2022-11-23T02:05:07.9965794Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa6ecpwy2/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9965814Z 2022-11-23T02:05:07.9965922Z Running tests... 2022-11-23T02:05:07.9966174Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9966484Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9966747Z test_allgather_coalesced_async (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9966963Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17033 2022-11-23T02:05:07.9967176Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17034 2022-11-23T02:05:07.9967396Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17035 2022-11-23T02:05:07.9967605Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17036 2022-11-23T02:05:07.9967972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9968127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9968499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9968690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9969055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9969231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9969657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9969860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9970226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9970398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9970743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9970931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9971290Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9971461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9971827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9972067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9972322Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0qyrvwop 2022-11-23T02:05:07.9972597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0qyrvwop/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9972830Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx6tqarn8 2022-11-23T02:05:07.9973084Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6chpop1p 2022-11-23T02:05:07.9973351Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx6tqarn8/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9973616Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6chpop1p/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9973862Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm_4dhs5u 2022-11-23T02:05:07.9974127Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm_4dhs5u/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9974354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9974580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:07.9974804Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:07.9975008Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9975247Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:07.9975488Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:07.9975726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:05:07.9975972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:05:07.9976377Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:05:07.9976772Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:05:07.9977162Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:05:07.9977544Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:05:07.9978266Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9978383Z warnings.warn( 2022-11-23T02:05:07.9979159Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9979274Z warnings.warn( 2022-11-23T02:05:07.9980001Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9980109Z warnings.warn( 2022-11-23T02:05:07.9980825Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9980990Z warnings.warn( 2022-11-23T02:05:07.9981098Z ok (4.229s) 2022-11-23T02:05:07.9981118Z 2022-11-23T02:05:07.9981390Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9981484Z Ran 1 test in 4.230s 2022-11-23T02:05:07.9981526Z 2022-11-23T02:05:07.9981598Z OK 2022-11-23T02:05:07.9981616Z 2022-11-23T02:05:07.9981740Z Generating XML reports... 
2022-11-23T02:05:07.9982169Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015908.xml 2022-11-23T02:05:07.9982532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9982707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9983085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9983283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9983535Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi9yrlu6s 2022-11-23T02:05:07.9983785Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi9yrlu6s/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9983805Z 2022-11-23T02:05:07.9984153Z Running tests... 2022-11-23T02:05:07.9984431Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9984741Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:07.9985006Z test_allgather_coalesced_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:07.9985223Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17376 2022-11-23T02:05:07.9985439Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17377 2022-11-23T02:05:07.9985709Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17378 2022-11-23T02:05:07.9985903Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17379 2022-11-23T02:05:07.9986279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9986459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9986842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9987037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9987399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9987578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9987952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9988230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9988583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9988760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9989137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9989328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9989697Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9989874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9990252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9990511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9990749Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp88tu8gd_ 2022-11-23T02:05:07.9991024Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp88tu8gd_/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9991282Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa3gw2cwv 2022-11-23T02:05:07.9991553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa3gw2cwv/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9991788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:07.9992016Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:07.9992274Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpujks_zj5 2022-11-23T02:05:07.9992550Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpujks_zj5/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9992807Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_866n3jp 2022-11-23T02:05:07.9993049Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_866n3jp/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9993277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:07.9993503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:07.9994244Z 
/opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9994358Z warnings.warn( 2022-11-23T02:05:07.9995090Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9995203Z warnings.warn( 2022-11-23T02:05:07.9995923Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9996032Z warnings.warn( 2022-11-23T02:05:07.9996744Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:07.9996852Z warnings.warn( 2022-11-23T02:05:07.9996938Z ok (4.134s) 2022-11-23T02:05:07.9996958Z 2022-11-23T02:05:07.9997273Z ---------------------------------------------------------------------- 2022-11-23T02:05:07.9997391Z Ran 1 test in 4.134s 2022-11-23T02:05:07.9997411Z 2022-11-23T02:05:07.9997502Z OK 2022-11-23T02:05:07.9997521Z 2022-11-23T02:05:07.9997648Z Generating XML reports... 
2022-11-23T02:05:07.9998080Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015915.xml 2022-11-23T02:05:07.9998448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:07.9998622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:07.9998978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:07.9999169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:07.9999470Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr4oceujq 2022-11-23T02:05:07.9999742Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr4oceujq/_remote_module_non_scriptable.py 2022-11-23T02:05:07.9999762Z 2022-11-23T02:05:07.9999872Z Running tests... 2022-11-23T02:05:08.0000138Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0000446Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0000714Z test_allgather_noncontiguous_input (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0000912Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17719 2022-11-23T02:05:08.0001129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17720 2022-11-23T02:05:08.0001346Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17721 2022-11-23T02:05:08.0001562Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17722 2022-11-23T02:05:08.0001932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0002108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0002485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0002674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0003038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0003192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0003559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0003745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0004112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0004285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0004650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0004836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0005193Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0005347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0005719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0005904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0006208Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0swx3vpv 2022-11-23T02:05:08.0006482Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0swx3vpv/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0006711Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0006967Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2ixma8td 2022-11-23T02:05:08.0007232Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2ixma8td/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0007458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0007690Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5kr_2ysn 2022-11-23T02:05:08.0007952Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5kr_2ysn/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0008204Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0co2s0s_ 2022-11-23T02:05:08.0008517Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0co2s0s_/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0008739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0008965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0009067Z ok (4.230s) 
2022-11-23T02:05:08.0009087Z 2022-11-23T02:05:08.0009361Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0009454Z Ran 1 test in 4.230s 2022-11-23T02:05:08.0009490Z 2022-11-23T02:05:08.0009562Z OK 2022-11-23T02:05:08.0009581Z 2022-11-23T02:05:08.0009703Z Generating XML reports... 2022-11-23T02:05:08.0010133Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015921.xml 2022-11-23T02:05:08.0010500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0010682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0011061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0011251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0011503Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpurnymo7f 2022-11-23T02:05:08.0011752Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpurnymo7f/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0011772Z 2022-11-23T02:05:08.0011880Z Running tests... 2022-11-23T02:05:08.0012150Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0012460Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0012708Z test_allgather_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0012933Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18062 2022-11-23T02:05:08.0013152Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18063 2022-11-23T02:05:08.0013364Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18064 2022-11-23T02:05:08.0013558Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18065 2022-11-23T02:05:08.0013924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0014097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0014471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0014662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0015068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0015248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0015621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0015808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0016146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0016318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0016690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0016874Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0017229Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0017450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0017822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0018013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0018249Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw7_jvga8 2022-11-23T02:05:08.0018514Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw7_jvga8/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0018742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0018997Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwjno58ub 2022-11-23T02:05:08.0019262Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwjno58ub/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0019520Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpke9cypsy 2022-11-23T02:05:08.0019784Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpke9cypsy/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0020011Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0020260Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1bfi251b 2022-11-23T02:05:08.0020506Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1bfi251b/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0020728Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0020951Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0021053Z ok (4.637s) 
2022-11-23T02:05:08.0021073Z 2022-11-23T02:05:08.0021344Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0021460Z Ran 1 test in 4.637s 2022-11-23T02:05:08.0021483Z 2022-11-23T02:05:08.0021575Z OK 2022-11-23T02:05:08.0021593Z 2022-11-23T02:05:08.0021716Z Generating XML reports... 2022-11-23T02:05:08.0022128Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015928.xml 2022-11-23T02:05:08.0022497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0022671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0023045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0023232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0023483Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzge713je 2022-11-23T02:05:08.0023753Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzge713je/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0023822Z 2022-11-23T02:05:08.0024175Z Running tests... 2022-11-23T02:05:08.0024449Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0024741Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0024997Z test_allgather_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0025215Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18429 2022-11-23T02:05:08.0025430Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18430 2022-11-23T02:05:08.0025646Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18431 2022-11-23T02:05:08.0025857Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18432 2022-11-23T02:05:08.0026230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0026484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0026843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0027033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0027395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0027569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0027943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0028131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0028486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0028667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0029043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0029210Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0029576Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0029749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0030116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0030305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0030559Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9e5b5qqx 2022-11-23T02:05:08.0030830Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9e5b5qqx/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0031086Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj5cm3fkb 2022-11-23T02:05:08.0031332Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj5cm3fkb/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0031559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0031786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0032034Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw_9s92ul 2022-11-23T02:05:08.0032292Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw_9s92ul/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0032517Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0032767Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptwurvxbg 2022-11-23T02:05:08.0033096Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptwurvxbg/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0033331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0033414Z ok (7.542s) 
2022-11-23T02:05:08.0033435Z 2022-11-23T02:05:08.0033704Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0033815Z Ran 1 test in 7.542s 2022-11-23T02:05:08.0033834Z 2022-11-23T02:05:08.0033926Z OK 2022-11-23T02:05:08.0033944Z 2022-11-23T02:05:08.0034068Z Generating XML reports... 2022-11-23T02:05:08.0034499Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015935.xml 2022-11-23T02:05:08.0034864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0035037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0035448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0035643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0035895Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkr1dt4oi 2022-11-23T02:05:08.0036158Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkr1dt4oi/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0036177Z 2022-11-23T02:05:08.0036285Z Running tests... 2022-11-23T02:05:08.0036550Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0036860Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0037104Z test_allreduce_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0037322Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18800 2022-11-23T02:05:08.0037527Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18801 2022-11-23T02:05:08.0037742Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18802 2022-11-23T02:05:08.0037949Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18803 2022-11-23T02:05:08.0038316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0038489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0038864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0039054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0039413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0039570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0039944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0040136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0040494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0040664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0041031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0041217Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0041579Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0041752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0042152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0042351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0042609Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0zck9qxp 2022-11-23T02:05:08.0042879Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0zck9qxp/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0043129Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf_ijjy_f 2022-11-23T02:05:08.0043392Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf_ijjy_f/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0043620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0043848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0044129Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqq9gequw 2022-11-23T02:05:08.0044393Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqq9gequw/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0044620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0044868Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqn_gehw6 2022-11-23T02:05:08.0045129Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqn_gehw6/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0045353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0045453Z ok (4.219s) 
2022-11-23T02:05:08.0045474Z 2022-11-23T02:05:08.0045745Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0045839Z Ran 1 test in 4.219s 2022-11-23T02:05:08.0045877Z 2022-11-23T02:05:08.0045953Z OK 2022-11-23T02:05:08.0045972Z 2022-11-23T02:05:08.0046095Z Generating XML reports... 2022-11-23T02:05:08.0046529Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015945.xml 2022-11-23T02:05:08.0046898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0047073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0047449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0047638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0047889Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp239hm4zi 2022-11-23T02:05:08.0048133Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp239hm4zi/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0048171Z 2022-11-23T02:05:08.0048265Z Running tests... 2022-11-23T02:05:08.0048533Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0048842Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0049101Z test_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0049318Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19143 2022-11-23T02:05:08.0049532Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19144 2022-11-23T02:05:08.0049743Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19145 2022-11-23T02:05:08.0049933Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19146 2022-11-23T02:05:08.0050298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0050474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0050913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0051112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0051474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0051646Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0052015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0052203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0052548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0052721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0053142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0053329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0053689Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0053864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0054289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0054496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0054735Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2x8ft3ee 2022-11-23T02:05:08.0055006Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2x8ft3ee/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0055259Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptao77hox 2022-11-23T02:05:08.0055531Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptao77hox/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0055756Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0055982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0056232Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9niss7vr 2022-11-23T02:05:08.0056491Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9niss7vr/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0056740Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1shvm4wo 2022-11-23T02:05:08.0056983Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1shvm4wo/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0057208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0057435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0057538Z ok (6.038s) 
2022-11-23T02:05:08.0057558Z 2022-11-23T02:05:08.0057829Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0057941Z Ran 1 test in 6.038s 2022-11-23T02:05:08.0057960Z 2022-11-23T02:05:08.0058053Z OK 2022-11-23T02:05:08.0058072Z 2022-11-23T02:05:08.0058195Z Generating XML reports... 2022-11-23T02:05:08.0058605Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015952.xml 2022-11-23T02:05:08.0058973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0059151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0059527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0059769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0060028Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw6cs971i 2022-11-23T02:05:08.0060294Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw6cs971i/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0060314Z 2022-11-23T02:05:08.0060421Z Running tests... 2022-11-23T02:05:08.0060687Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0060977Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0061252Z test_allreduce_basics_cuda_using_work_api (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0061466Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19490 2022-11-23T02:05:08.0061684Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19491 2022-11-23T02:05:08.0061950Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19492 2022-11-23T02:05:08.0062159Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19493 2022-11-23T02:05:08.0062530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0062705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0063066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0063259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0063620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0063790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0064435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0064631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0064998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0065171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0065596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0065787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0066151Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0066324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0066690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0066885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0067141Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4jxt8nw5 2022-11-23T02:05:08.0067407Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4jxt8nw5/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0067634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0067868Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpms676ygq 2022-11-23T02:05:08.0068133Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpms676ygq/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0068386Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpim6c3sal 2022-11-23T02:05:08.0068647Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpim6c3sal/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0068945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0069175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0069429Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf96zxpur 2022-11-23T02:05:08.0069694Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf96zxpur/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0069919Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0070001Z ok (6.079s) 
2022-11-23T02:05:08.0070025Z 2022-11-23T02:05:08.0070298Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0070408Z Ran 1 test in 6.080s 2022-11-23T02:05:08.0070428Z 2022-11-23T02:05:08.0070521Z OK 2022-11-23T02:05:08.0070540Z 2022-11-23T02:05:08.0070664Z Generating XML reports... 2022-11-23T02:05:08.0071246Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020000.xml 2022-11-23T02:05:08.0071613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0071788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0072144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0072338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0072591Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkgak24c1 2022-11-23T02:05:08.0072856Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkgak24c1/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0072876Z 2022-11-23T02:05:08.0072983Z Running tests... 2022-11-23T02:05:08.0073247Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0073563Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0073833Z test_allreduce_basics_using_work_api (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0074055Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19837 2022-11-23T02:05:08.0074250Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19838 2022-11-23T02:05:08.0074464Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19839 2022-11-23T02:05:08.0074677Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19840 2022-11-23T02:05:08.0075042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0075215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0075587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0075784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0076154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0076311Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0076679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0076869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0077228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0077404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0077771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0078084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0078456Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0078627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0078980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0079167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0079421Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpivj65i18 2022-11-23T02:05:08.0079686Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpivj65i18/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0079938Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpper69ihu 2022-11-23T02:05:08.0080249Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpper69ihu/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0080506Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8sxtwrve 2022-11-23T02:05:08.0080773Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8sxtwrve/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0081023Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptmordxka 2022-11-23T02:05:08.0081272Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptmordxka/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0081498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0081719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0081945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0082168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0082273Z ok (4.236s) 
2022-11-23T02:05:08.0082296Z 2022-11-23T02:05:08.0082567Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0082679Z Ran 1 test in 4.237s 2022-11-23T02:05:08.0082698Z 2022-11-23T02:05:08.0082770Z OK 2022-11-23T02:05:08.0082789Z 2022-11-23T02:05:08.0082911Z Generating XML reports... 2022-11-23T02:05:08.0083341Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020009.xml 2022-11-23T02:05:08.0083708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0083882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0084255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0084443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0084701Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4z3q7sj9 2022-11-23T02:05:08.0084949Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4z3q7sj9/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0084988Z 2022-11-23T02:05:08.0085077Z Running tests... 2022-11-23T02:05:08.0085339Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0085650Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0085900Z test_allreduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0086119Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20180 2022-11-23T02:05:08.0086335Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20181 2022-11-23T02:05:08.0086544Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20182 2022-11-23T02:05:08.0086811Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20183 2022-11-23T02:05:08.0087173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0087347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0087718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0087906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0088265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0088439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0088808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0089050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0089393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0089568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0089942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0090125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0090479Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0090649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0091017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0091205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0091467Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmsb0x2xv 2022-11-23T02:05:08.0091716Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmsb0x2xv/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0091965Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoxqxoh5s 2022-11-23T02:05:08.0092231Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoxqxoh5s/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0092457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0092677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0092931Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphh7ap5tj 2022-11-23T02:05:08.0093202Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphh7ap5tj/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0093433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0093666Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoju36ykl 2022-11-23T02:05:08.0093927Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoju36ykl/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0094147Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0094247Z ok (4.331s) 
2022-11-23T02:05:08.0094267Z 2022-11-23T02:05:08.0094536Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0094647Z Ran 1 test in 4.332s 2022-11-23T02:05:08.0094666Z 2022-11-23T02:05:08.0094756Z OK 2022-11-23T02:05:08.0094775Z 2022-11-23T02:05:08.0094898Z Generating XML reports... 2022-11-23T02:05:08.0095327Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020016.xml 2022-11-23T02:05:08.0095729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0095912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0096290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0096483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0096737Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsrvzbp44 2022-11-23T02:05:08.0097002Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsrvzbp44/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0097022Z 2022-11-23T02:05:08.0097131Z Running tests... 2022-11-23T02:05:08.0097393Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0097684Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0097997Z test_allreduce_coalesced_async (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0098212Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20523 2022-11-23T02:05:08.0098429Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20524 2022-11-23T02:05:08.0098642Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20525 2022-11-23T02:05:08.0098852Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20526 2022-11-23T02:05:08.0099222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0099397Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0099773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0099945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0100314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0100488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0100859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0101047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0101411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0101583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0101954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0102122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0102496Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0102668Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0103036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0103219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0103473Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm8ejzev8 2022-11-23T02:05:08.0103740Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm8ejzev8/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0104268Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpicd7fhg7 2022-11-23T02:05:08.0104547Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpicd7fhg7/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0104759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0105057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0105317Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6l7r04b5 2022-11-23T02:05:08.0105580Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6l7r04b5/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0105805Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0106058Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptvbbi_6d 2022-11-23T02:05:08.0106321Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptvbbi_6d/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0106547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0106770Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:08.0107071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:05:08.0107313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:05:08.0107549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:05:08.0107953Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:05:08.0108345Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:05:08.0108738Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:05:08.0109121Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:05:08.0109867Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:08.0109981Z warnings.warn( 2022-11-23T02:05:08.0110710Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:08.0110803Z warnings.warn( 2022-11-23T02:05:08.0111525Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:08.0111639Z warnings.warn( 2022-11-23T02:05:08.0112359Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:05:08.0112470Z warnings.warn( 2022-11-23T02:05:08.0112572Z ok (4.330s) 2022-11-23T02:05:08.0112592Z 2022-11-23T02:05:08.0112856Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0112968Z Ran 1 test in 4.330s 2022-11-23T02:05:08.0112987Z 2022-11-23T02:05:08.0113079Z OK 2022-11-23T02:05:08.0113098Z 2022-11-23T02:05:08.0113203Z Generating XML reports... 
2022-11-23T02:05:08.0113631Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020022.xml 2022-11-23T02:05:08.0113997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0114236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0114625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0114816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0115073Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphnvyt19d 2022-11-23T02:05:08.0115342Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphnvyt19d/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0115362Z 2022-11-23T02:05:08.0115470Z Running tests... 2022-11-23T02:05:08.0115717Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0116027Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0116292Z test_allreduce_coalesced_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0116560Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20866 2022-11-23T02:05:08.0116775Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20867 2022-11-23T02:05:08.0116990Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20868 2022-11-23T02:05:08.0117204Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20869 2022-11-23T02:05:08.0117573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0117730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0118106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0118294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0118664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0118837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0119206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0119392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0119748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0119904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0120271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0120455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0120808Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0120986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0121361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0121545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0121801Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkwju23h7 2022-11-23T02:05:08.0122068Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkwju23h7/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0122300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2yx_3yet 2022-11-23T02:05:08.0122562Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2yx_3yet/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0122815Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpue5p2l7m 2022-11-23T02:05:08.0123130Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpue5p2l7m/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0123363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0123590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0123816Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0124065Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4o8280lf 2022-11-23T02:05:08.0124326Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4o8280lf/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0124531Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0124634Z ok (4.247s) 
2022-11-23T02:05:08.0124654Z 2022-11-23T02:05:08.0124927Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0125088Z Ran 1 test in 4.247s 2022-11-23T02:05:08.0125111Z 2022-11-23T02:05:08.0125214Z OK 2022-11-23T02:05:08.0125232Z 2022-11-23T02:05:08.0125363Z Generating XML reports... 2022-11-23T02:05:08.0125794Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020029.xml 2022-11-23T02:05:08.0126142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0126314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0126686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0126874Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0127122Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcqy1fwo1 2022-11-23T02:05:08.0127389Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcqy1fwo1/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0127411Z 2022-11-23T02:05:08.0127517Z Running tests... 2022-11-23T02:05:08.0127783Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0128070Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0128332Z test_allreduce_coalesced_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0128551Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21209 2022-11-23T02:05:08.0128768Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21210 2022-11-23T02:05:08.0128978Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21211 2022-11-23T02:05:08.0129187Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21212 2022-11-23T02:05:08.0129553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0129732Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0130110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0130282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0130641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0130813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0131182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0131367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0131728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0131951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0132330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0132496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0132852Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0133023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0133395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0133580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0133831Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5w6fy7y1 2022-11-23T02:05:08.0134148Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5w6fy7y1/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0134372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0134624Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprv0psq99 2022-11-23T02:05:08.0134872Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprv0psq99/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0135096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0135343Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiig58rcx 2022-11-23T02:05:08.0135607Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiig58rcx/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0135855Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9k5jbxve 2022-11-23T02:05:08.0136116Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9k5jbxve/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0136342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0136564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0136646Z ok (4.327s) 
2022-11-23T02:05:08.0136665Z 2022-11-23T02:05:08.0136935Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0137045Z Ran 1 test in 4.327s 2022-11-23T02:05:08.0137064Z 2022-11-23T02:05:08.0137153Z OK 2022-11-23T02:05:08.0137172Z 2022-11-23T02:05:08.0137294Z Generating XML reports... 2022-11-23T02:05:08.0137723Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020036.xml 2022-11-23T02:05:08.0138089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0138264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0138643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0138814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0139063Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3n325hrc 2022-11-23T02:05:08.0139324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3n325hrc/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0139344Z 2022-11-23T02:05:08.0139448Z Running tests... 2022-11-23T02:05:08.0139708Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0140012Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0140279Z test_allreduce_coalesced_checks_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0140498Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21552 2022-11-23T02:05:08.0140745Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21553 2022-11-23T02:05:08.0140966Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21554 2022-11-23T02:05:08.0141179Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21555 2022-11-23T02:05:08.0141546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0141720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0142094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0142285Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0142645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0142864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0143222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0143409Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0143769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0144309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0144687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0144872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0145224Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0145402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0145760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0145947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0146198Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcknahxjb 2022-11-23T02:05:08.0146466Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcknahxjb/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0146720Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl4o40ym2 2022-11-23T02:05:08.0146980Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl4o40ym2/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0147205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0147451Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_av_2yrm 2022-11-23T02:05:08.0147715Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_av_2yrm/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0148025Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdx2z2kr_ 2022-11-23T02:05:08.0148284Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdx2z2kr_/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0148505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0148727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0149036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0149139Z ok (6.046s) 
2022-11-23T02:05:08.0149158Z 2022-11-23T02:05:08.0149427Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0149538Z Ran 1 test in 6.046s 2022-11-23T02:05:08.0149562Z 2022-11-23T02:05:08.0149637Z OK 2022-11-23T02:05:08.0149658Z 2022-11-23T02:05:08.0149852Z Generating XML reports... 2022-11-23T02:05:08.0150298Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020043.xml 2022-11-23T02:05:08.0150662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0150835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0151208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0151400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0151652Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0545af56 2022-11-23T02:05:08.0151915Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0545af56/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0151992Z 2022-11-23T02:05:08.0152085Z Running tests... 2022-11-23T02:05:08.0152350Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0152658Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0152920Z test_allreduce_coalesced_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0153136Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21899 2022-11-23T02:05:08.0153351Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21900 2022-11-23T02:05:08.0153560Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21901 2022-11-23T02:05:08.0153768Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21902 2022-11-23T02:05:08.0154116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0154297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0154670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0154860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0155218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0155390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0155757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0155942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0156295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0156448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0156829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0157014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0157370Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0157538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0157903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0158090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0158344Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp72znk0yg 2022-11-23T02:05:08.0158593Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp72znk0yg/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0158821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0159120Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdpdwbnu5 2022-11-23T02:05:08.0159374Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbadj81uw 2022-11-23T02:05:08.0159637Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdpdwbnu5/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0159891Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbadj81uw/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0160111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0160333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0160583Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxau1qkpe 2022-11-23T02:05:08.0160828Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxau1qkpe/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0161101Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0161200Z ok (4.630s) 
2022-11-23T02:05:08.0161219Z 2022-11-23T02:05:08.0161485Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0161595Z Ran 1 test in 4.630s 2022-11-23T02:05:08.0161615Z 2022-11-23T02:05:08.0161704Z OK 2022-11-23T02:05:08.0161723Z 2022-11-23T02:05:08.0161843Z Generating XML reports... 2022-11-23T02:05:08.0162270Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020051.xml 2022-11-23T02:05:08.0162616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0162788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0163160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0163352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0163607Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1t9b2j9d 2022-11-23T02:05:08.0163871Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1t9b2j9d/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0163891Z 2022-11-23T02:05:08.0163996Z Running tests... 2022-11-23T02:05:08.0164263Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0164577Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0164808Z test_allreduce_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0165026Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22266 2022-11-23T02:05:08.0165241Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22267 2022-11-23T02:05:08.0165513Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22268 2022-11-23T02:05:08.0165771Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22269 2022-11-23T02:05:08.0166147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0166625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0167179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0167648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0168223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0168648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0169269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0169742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0170315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0170742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0171309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0171765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0172335Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0172761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0173330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0173844Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0174284Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2fg9vu78 2022-11-23T02:05:08.0174901Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2fg9vu78/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0175405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0175902Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu6myp0rr 2022-11-23T02:05:08.0176414Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu6myp0rr/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0176935Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4bnwkjch 2022-11-23T02:05:08.0177452Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4bnwkjch/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0177951Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0178396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0178884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy_wevln2 2022-11-23T02:05:08.0179640Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy_wevln2/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0180144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0180465Z ok (4.533s) 
2022-11-23T02:05:08.0180617Z 2022-11-23T02:05:08.0180889Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0181213Z Ran 1 test in 4.533s 2022-11-23T02:05:08.0181373Z 2022-11-23T02:05:08.0181446Z OK 2022-11-23T02:05:08.0181583Z 2022-11-23T02:05:08.0181706Z Generating XML reports... 2022-11-23T02:05:08.0182304Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020058.xml 2022-11-23T02:05:08.0183009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0183448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0184303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0184774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0185232Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmcnrxmvj 2022-11-23T02:05:08.0185750Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmcnrxmvj/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0186051Z 2022-11-23T02:05:08.0186158Z Running tests... 2022-11-23T02:05:08.0186568Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0187172Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0187682Z test_allreduce_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0188162Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22633 2022-11-23T02:05:08.0188603Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22634 2022-11-23T02:05:08.0189027Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22635 2022-11-23T02:05:08.0189458Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22636 2022-11-23T02:05:08.0190060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0190503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0191050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0191588Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0192161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0192579Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0193139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0193610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0194182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0194597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0195160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0195638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0196224Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0196672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0197216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0197671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0198130Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptz89lkto 2022-11-23T02:05:08.0198648Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptz89lkto/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0199174Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5nbfufsc 2022-11-23T02:05:08.0199706Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5nbfufsc/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0200211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0200659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0201145Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb_ly4epk 2022-11-23T02:05:08.0201672Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb_ly4epk/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0202152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0202643Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy75si0j3 2022-11-23T02:05:08.0203167Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy75si0j3/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0203664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0203989Z ok (6.413s) 
2022-11-23T02:05:08.0204230Z 2022-11-23T02:05:08.0204520Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0204865Z Ran 1 test in 6.413s 2022-11-23T02:05:08.0205030Z 2022-11-23T02:05:08.0205104Z OK 2022-11-23T02:05:08.0205241Z 2022-11-23T02:05:08.0205365Z Generating XML reports... 2022-11-23T02:05:08.0205962Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020105.xml 2022-11-23T02:05:08.0206648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0207092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0207641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0208103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0208622Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkut9kx5e 2022-11-23T02:05:08.0209149Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkut9kx5e/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0209431Z 2022-11-23T02:05:08.0209537Z Running tests... 2022-11-23T02:05:08.0209936Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0210458Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0210930Z test_barrier_implies_wait (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0211404Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23004 2022-11-23T02:05:08.0211850Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23005 2022-11-23T02:05:08.0212292Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23006 2022-11-23T02:05:08.0212764Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23007 2022-11-23T02:05:08.0213368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0213826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0214392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0214856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0215410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0215854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0216420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0216898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0217483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0217928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0218499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0218959Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0219544Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0219992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0220561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0221012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0221536Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6_dqvh9c 2022-11-23T02:05:08.0222083Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6_dqvh9c/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0222620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzz3vntjf 2022-11-23T02:05:08.0223143Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzz3vntjf/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0223679Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp831b7j9l 2022-11-23T02:05:08.0224564Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp831b7j9l/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0225057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0225531Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0226085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0226589Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsvj21zv1 2022-11-23T02:05:08.0227107Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsvj21zv1/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0227615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0227959Z ok (4.220s) 
2022-11-23T02:05:08.0228106Z 2022-11-23T02:05:08.0228367Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0228697Z Ran 1 test in 4.220s 2022-11-23T02:05:08.0228858Z 2022-11-23T02:05:08.0228950Z OK 2022-11-23T02:05:08.0229085Z 2022-11-23T02:05:08.0229210Z Generating XML reports... 2022-11-23T02:05:08.0229783Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020114.xml 2022-11-23T02:05:08.0230494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0230943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0231502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0231973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0232436Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpefmt_kfk 2022-11-23T02:05:08.0232977Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpefmt_kfk/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0233275Z 2022-11-23T02:05:08.0233365Z Running tests... 2022-11-23T02:05:08.0233772Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0234305Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0234789Z test_broadcast_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0235277Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23347 2022-11-23T02:05:08.0235728Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23348 2022-11-23T02:05:08.0236179Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23349 2022-11-23T02:05:08.0236608Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23350 2022-11-23T02:05:08.0237221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0237683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0238270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0238723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0239376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0239846Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0240407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0240877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0241449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0241895Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0242449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0242912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0243547Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0243970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0244541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0244999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0245514Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7pjoelia 2022-11-23T02:05:08.0246042Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7pjoelia/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0246555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0247062Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_55y_3c9 2022-11-23T02:05:08.0247593Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_55y_3c9/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0248109Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcmlryx6z 2022-11-23T02:05:08.0248646Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcmlryx6z/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0249178Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8jgmpota 2022-11-23T02:05:08.0249697Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8jgmpota/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0250206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0250680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0251153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0251480Z ok (4.203s) 
2022-11-23T02:05:08.0251627Z 2022-11-23T02:05:08.0251904Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0252236Z Ran 1 test in 4.203s 2022-11-23T02:05:08.0252396Z 2022-11-23T02:05:08.0252471Z OK 2022-11-23T02:05:08.0252604Z 2022-11-23T02:05:08.0252728Z Generating XML reports... 2022-11-23T02:05:08.0253315Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020121.xml 2022-11-23T02:05:08.0254018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0254451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0255024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0255494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0255945Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjylib716 2022-11-23T02:05:08.0256547Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjylib716/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0256854Z 2022-11-23T02:05:08.0256963Z Running tests... 2022-11-23T02:05:08.0257369Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0257881Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0258388Z test_broadcast_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0258877Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23690 2022-11-23T02:05:08.0259310Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23691 2022-11-23T02:05:08.0259763Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23692 2022-11-23T02:05:08.0260208Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23693 2022-11-23T02:05:08.0260909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0261347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0261922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0262395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0262953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0263399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0264231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0264758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0265324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0265855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0266436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0266899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0267456Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0267903Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0268469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0268935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0269399Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp00o37xlu 2022-11-23T02:05:08.0269951Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp00o37xlu/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0270486Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf5bvik6v 2022-11-23T02:05:08.0271008Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf5bvik6v/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0271522Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0271992Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0272499Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc7z2zheo 2022-11-23T02:05:08.0273019Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc7z2zheo/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0273556Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqrv1g4vt 2022-11-23T02:05:08.0274170Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqrv1g4vt/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0274669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0275136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0275481Z ok (6.150s) 
2022-11-23T02:05:08.0275631Z 2022-11-23T02:05:08.0275906Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0276213Z Ran 1 test in 6.150s 2022-11-23T02:05:08.0276428Z 2022-11-23T02:05:08.0276520Z OK 2022-11-23T02:05:08.0276652Z 2022-11-23T02:05:08.0276777Z Generating XML reports... 2022-11-23T02:05:08.0277344Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020128.xml 2022-11-23T02:05:08.0278049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0278571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0279154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0279612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0280076Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmmio4yfo 2022-11-23T02:05:08.0280613Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmmio4yfo/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0280912Z 2022-11-23T02:05:08.0281003Z Running tests... 2022-11-23T02:05:08.0281404Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0281936Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0282433Z test_broadcast_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0282902Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24037 2022-11-23T02:05:08.0283359Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24038 2022-11-23T02:05:08.0283807Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24039 2022-11-23T02:05:08.0284237Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24040 2022-11-23T02:05:08.0284843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0285293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0285869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0286325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0286901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0287356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0287915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0288390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0288963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0289417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0289974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0290453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0291037Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0291495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0292193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0292675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0293151Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj369gc9m 2022-11-23T02:05:08.0293675Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj369gc9m/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0294226Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg6r2wb99 2022-11-23T02:05:08.0294762Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg6r2wb99/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0295271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0295751Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp055vjynb 2022-11-23T02:05:08.0296333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdx_3amrp 2022-11-23T02:05:08.0296864Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp055vjynb/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0297390Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdx_3amrp/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0297911Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0298395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0298865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0299196Z ok (4.281s) 
2022-11-23T02:05:08.0299350Z 2022-11-23T02:05:08.0299631Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0299974Z Ran 1 test in 4.282s 2022-11-23T02:05:08.0300141Z 2022-11-23T02:05:08.0300215Z OK 2022-11-23T02:05:08.0300360Z 2022-11-23T02:05:08.0300490Z Generating XML reports... 2022-11-23T02:05:08.0301088Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020136.xml 2022-11-23T02:05:08.0301798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0302234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0302823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0303305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0303781Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplo4du923 2022-11-23T02:05:08.0304597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplo4du923/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0304909Z 2022-11-23T02:05:08.0305024Z Running tests... 2022-11-23T02:05:08.0305449Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0305971Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0306481Z test_broadcast_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0306967Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24380 2022-11-23T02:05:08.0307418Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24381 2022-11-23T02:05:08.0307852Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24382 2022-11-23T02:05:08.0308299Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24383 2022-11-23T02:05:08.0308918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0309358Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0310031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0310522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0311110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0311541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0312115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0312591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0313149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0313608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0314262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0314737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0315300Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0346124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0346814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0347281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0347727Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_vfhjuh3 2022-11-23T02:05:08.0348235Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_vfhjuh3/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0348754Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0kyhaz9h 2022-11-23T02:05:08.0349269Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0kyhaz9h/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0349757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0350230Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppi8s3msp 2022-11-23T02:05:08.0350752Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppi8s3msp/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0351237Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0351681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0352151Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx318mrua 2022-11-23T02:05:08.0352660Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx318mrua/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0353148Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0353461Z ok (4.291s) 
2022-11-23T02:05:08.0353597Z 2022-11-23T02:05:08.0353866Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0354174Z Ran 1 test in 4.291s 2022-11-23T02:05:08.0354407Z 2022-11-23T02:05:08.0354488Z OK 2022-11-23T02:05:08.0354605Z 2022-11-23T02:05:08.0354845Z Generating XML reports... 2022-11-23T02:05:08.0355412Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020143.xml 2022-11-23T02:05:08.0356089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0356511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0357058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0357651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0358111Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppo4ve27h 2022-11-23T02:05:08.0358624Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppo4ve27h/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0358913Z 2022-11-23T02:05:08.0359009Z Running tests... 2022-11-23T02:05:08.0359396Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0359897Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0360381Z test_broadcast_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0360844Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24747 2022-11-23T02:05:08.0361268Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24748 2022-11-23T02:05:08.0361791Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24749 2022-11-23T02:05:08.0362219Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24750 2022-11-23T02:05:08.0362825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0363342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0363911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0364360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0364931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0365382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0366097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0366567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0367143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0367572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0368139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0368602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0369157Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0369602Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0370171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0370640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0371085Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu2wbqtki 2022-11-23T02:05:08.0371622Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu2wbqtki/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0372132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0372615Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdgfypx_u 2022-11-23T02:05:08.0373156Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdgfypx_u/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0373692Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx5eqdebp 2022-11-23T02:05:08.0374227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx5eqdebp/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0374911Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0375389Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0375883Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8_uhl0ds 2022-11-23T02:05:08.0376415Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8_uhl0ds/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0376900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0377242Z ok (6.343s) 
2022-11-23T02:05:08.0377392Z 2022-11-23T02:05:08.0377667Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0377980Z Ran 1 test in 6.343s 2022-11-23T02:05:08.0378141Z 2022-11-23T02:05:08.0378234Z OK 2022-11-23T02:05:08.0378376Z 2022-11-23T02:05:08.0378499Z Generating XML reports... 2022-11-23T02:05:08.0379134Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020150.xml 2022-11-23T02:05:08.0379837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0380287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0380863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0381313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0381784Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqp61fgzk 2022-11-23T02:05:08.0382331Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqp61fgzk/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0382634Z 2022-11-23T02:05:08.0382749Z Running tests... 2022-11-23T02:05:08.0383135Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0383674Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0384470Z test_empty_tensors (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0384936Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25118 2022-11-23T02:05:08.0385386Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25119 2022-11-23T02:05:08.0385833Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25120 2022-11-23T02:05:08.0386292Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25121 2022-11-23T02:05:08.0386886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0387340Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0387901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0388336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0388912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0389385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0389976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0390426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0391001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0391451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0392027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0392574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0393169Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0393613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0394161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0394632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0395100Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvms6typk 2022-11-23T02:05:08.0395645Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvms6typk/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0396161Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp20j9r06l 2022-11-23T02:05:08.0396771Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp20j9r06l/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0397307Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpesdbdqn1 2022-11-23T02:05:08.0397845Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpesdbdqn1/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0398333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0398820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0399300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0399772Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2ei343q0 2022-11-23T02:05:08.0400309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2ei343q0/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0400818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0401168Z ok (4.239s) 
2022-11-23T02:05:08.0401304Z 2022-11-23T02:05:08.0401579Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0401916Z Ran 1 test in 4.240s 2022-11-23T02:05:08.0402115Z 2022-11-23T02:05:08.0402209Z OK 2022-11-23T02:05:08.0402345Z 2022-11-23T02:05:08.0402450Z Generating XML reports... 2022-11-23T02:05:08.0403042Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020158.xml 2022-11-23T02:05:08.0403743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0404195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0404755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0405230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0405705Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp78r215q6 2022-11-23T02:05:08.0406240Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp78r215q6/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0406518Z 2022-11-23T02:05:08.0406632Z Running tests... 2022-11-23T02:05:08.0407041Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0407584Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0408176Z test_gather_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0408652Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25461 2022-11-23T02:05:08.0409108Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25462 2022-11-23T02:05:08.0409561Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25463 2022-11-23T02:05:08.0410044Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25464 2022-11-23T02:05:08.0410651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0411110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0411666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0412144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0412732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0413180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0413740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0414265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0414855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0415304Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0415854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0416327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0416904Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0417340Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0417916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0418383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0418858Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxs8am0yf 2022-11-23T02:05:08.0419383Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxs8am0yf/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0419926Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0420434Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpayaigigh 2022-11-23T02:05:08.0420941Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi1d_o1r7 2022-11-23T02:05:08.0421479Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpayaigigh/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0422021Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi1d_o1r7/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0422548Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphg5809n9 2022-11-23T02:05:08.0423070Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphg5809n9/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0423589Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0424324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0424807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0425130Z ok (4.200s) 
2022-11-23T02:05:08.0425280Z 2022-11-23T02:05:08.0425560Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0425902Z Ran 1 test in 4.200s 2022-11-23T02:05:08.0426067Z 2022-11-23T02:05:08.0426145Z OK 2022-11-23T02:05:08.0426309Z 2022-11-23T02:05:08.0426437Z Generating XML reports... 2022-11-23T02:05:08.0427037Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020205.xml 2022-11-23T02:05:08.0427826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0428270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0428852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0429322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0429777Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjqzvpuhy 2022-11-23T02:05:08.0430330Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjqzvpuhy/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0430633Z 2022-11-23T02:05:08.0430741Z Running tests... 2022-11-23T02:05:08.0431149Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0431671Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0432248Z test_gather_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0432734Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25804 2022-11-23T02:05:08.0433169Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25805 2022-11-23T02:05:08.0433624Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25806 2022-11-23T02:05:08.0434097Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25807 2022-11-23T02:05:08.0434717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0435158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0435727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0436171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0436761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0437217Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0437815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0438288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0438845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0439365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0439942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0440414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0440978Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0441429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0442000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0442480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0442934Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkg2c1oxg 2022-11-23T02:05:08.0443481Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkg2c1oxg/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0443996Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0444475Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9hf_2kdu 2022-11-23T02:05:08.0445023Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9hf_2kdu/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0445614Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpssi7m99v 2022-11-23T02:05:08.0445887Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpssi7m99v/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0446147Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn171hpxr 2022-11-23T02:05:08.0446378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0446625Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn171hpxr/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0446853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0447086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0447191Z ok (6.120s) 
2022-11-23T02:05:08.0447211Z 2022-11-23T02:05:08.0447539Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0447662Z Ran 1 test in 6.120s 2022-11-23T02:05:08.0447682Z 2022-11-23T02:05:08.0447784Z OK 2022-11-23T02:05:08.0447803Z 2022-11-23T02:05:08.0447931Z Generating XML reports... 2022-11-23T02:05:08.0448345Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020212.xml 2022-11-23T02:05:08.0448722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0448902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0449287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0449479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0449733Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0vy927nk 2022-11-23T02:05:08.0450014Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0vy927nk/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0450034Z 2022-11-23T02:05:08.0450143Z Running tests... 2022-11-23T02:05:08.0450417Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0450710Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0450954Z test_gather_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0451238Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26151 2022-11-23T02:05:08.0451458Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26152 2022-11-23T02:05:08.0451671Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26153 2022-11-23T02:05:08.0451889Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26154 2022-11-23T02:05:08.0452267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0452447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0452825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0452998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0453373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0453549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0453931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0454121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0454491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0454777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0455172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0455342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0455709Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0455886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0456264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0456460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0456718Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa86vp6q1 2022-11-23T02:05:08.0457051Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa86vp6q1/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0457305Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0tqeu49p 2022-11-23T02:05:08.0457580Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0tqeu49p/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0457809Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_wbw6gyq 2022-11-23T02:05:08.0458074Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_wbw6gyq/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0458330Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc6sfsmpi 2022-11-23T02:05:08.0458592Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc6sfsmpi/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0458820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0459051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0459278Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0459503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0459586Z ok (4.363s) 
2022-11-23T02:05:08.0459627Z 2022-11-23T02:05:08.0459881Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0459994Z Ran 1 test in 4.363s 2022-11-23T02:05:08.0460014Z 2022-11-23T02:05:08.0460108Z OK 2022-11-23T02:05:08.0460127Z 2022-11-23T02:05:08.0460258Z Generating XML reports... 2022-11-23T02:05:08.0460690Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020220.xml 2022-11-23T02:05:08.0461064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0461244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0461626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0461798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0462058Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2f2m21mm 2022-11-23T02:05:08.0462324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2f2m21mm/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0462344Z 2022-11-23T02:05:08.0462460Z Running tests... 2022-11-23T02:05:08.0462723Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0463036Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0463301Z test_gather_noncontiguous_input (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0463524Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26494 2022-11-23T02:05:08.0463779Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26495 2022-11-23T02:05:08.0464264Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26496 2022-11-23T02:05:08.0464494Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26497 2022-11-23T02:05:08.0464877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0465061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0465441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0465681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0466099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0466363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0466721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0466918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0467292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0467469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0467842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0468037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0468400Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0468580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0468939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0469138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0469403Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6kuvbo6h 2022-11-23T02:05:08.0469671Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6kuvbo6h/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0469901Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0470157Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqvo4eydv 2022-11-23T02:05:08.0470430Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqvo4eydv/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0470680Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfi_14oj4 2022-11-23T02:05:08.0470953Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfi_14oj4/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0471184Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxbq39ure 2022-11-23T02:05:08.0471445Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxbq39ure/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0471672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0471901Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0472126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0472231Z ok (4.220s) 
2022-11-23T02:05:08.0472251Z 2022-11-23T02:05:08.0472529Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0472645Z Ran 1 test in 4.220s 2022-11-23T02:05:08.0472668Z 2022-11-23T02:05:08.0472762Z OK 2022-11-23T02:05:08.0472782Z 2022-11-23T02:05:08.0472953Z Generating XML reports... 2022-11-23T02:05:08.0473405Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020227.xml 2022-11-23T02:05:08.0473808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0473985Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0474360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0474552Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0474805Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1g7b3smu 2022-11-23T02:05:08.0475201Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1g7b3smu/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0475269Z 2022-11-23T02:05:08.0475361Z Running tests... 2022-11-23T02:05:08.0475632Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0475953Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0476195Z test_gather_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0476414Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26837 2022-11-23T02:05:08.0476633Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26838 2022-11-23T02:05:08.0476848Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26839 2022-11-23T02:05:08.0477085Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26840 2022-11-23T02:05:08.0477459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0477622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0478008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0478200Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0478569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0478745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0479120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0479307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0479674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0479830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0480211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0480408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0480781Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0480957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0481335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0481525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0481782Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc_4cw_em 2022-11-23T02:05:08.0482054Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc_4cw_em/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0482317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0482575Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7dl89g1c 2022-11-23T02:05:08.0482842Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7dl89g1c/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0483069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0483325Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6ivh1h79 2022-11-23T02:05:08.0483592Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6ivh1h79/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0483825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0484076Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmlugis31 2022-11-23T02:05:08.0484340Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmlugis31/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0484595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0484707Z ok (4.848s) 
2022-11-23T02:05:08.0484727Z 2022-11-23T02:05:08.0484998Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0485113Z Ran 1 test in 4.849s 2022-11-23T02:05:08.0485133Z 2022-11-23T02:05:08.0485227Z OK 2022-11-23T02:05:08.0485246Z 2022-11-23T02:05:08.0485376Z Generating XML reports... 2022-11-23T02:05:08.0485806Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020234.xml 2022-11-23T02:05:08.0486179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0486338Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0486714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0486940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0487194Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppatj12la 2022-11-23T02:05:08.0487462Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppatj12la/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0487482Z 2022-11-23T02:05:08.0487601Z Running tests... 2022-11-23T02:05:08.0487866Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0488176Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0488427Z test_gather_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0488629Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27204 2022-11-23T02:05:08.0488850Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27205 2022-11-23T02:05:08.0489073Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27206 2022-11-23T02:05:08.0489288Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27207 2022-11-23T02:05:08.0489663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0489839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0490220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0490409Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0490754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0490938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0491366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0491570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0491940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0492116Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0492493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0492691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0493056Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0493213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0493593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0493839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0494094Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplwn6x1jo 2022-11-23T02:05:08.0494361Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplwn6x1jo/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0494595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0494850Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0d7lrgk8 2022-11-23T02:05:08.0495121Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0d7lrgk8/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0495375Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpax5_5w33 2022-11-23T02:05:08.0495607Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyw5hyrph 2022-11-23T02:05:08.0495870Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpax5_5w33/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0496121Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyw5hyrph/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0496343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0496568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0496800Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0496903Z ok (7.841s) 
2022-11-23T02:05:08.0496923Z 2022-11-23T02:05:08.0497193Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0497287Z Ran 1 test in 7.841s 2022-11-23T02:05:08.0497350Z 2022-11-23T02:05:08.0497425Z OK 2022-11-23T02:05:08.0497443Z 2022-11-23T02:05:08.0497569Z Generating XML reports... 2022-11-23T02:05:08.0498006Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020241.xml 2022-11-23T02:05:08.0498380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0498559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0498941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0499132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0499390Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt4u2rs5i 2022-11-23T02:05:08.0499634Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt4u2rs5i/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0499654Z 2022-11-23T02:05:08.0499763Z Running tests... 2022-11-23T02:05:08.0500035Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0500393Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0500658Z test_multi_device_constructor (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0500877Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27575 2022-11-23T02:05:08.0501101Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27576 2022-11-23T02:05:08.0501315Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27577 2022-11-23T02:05:08.0501509Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27578 2022-11-23T02:05:08.0501882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0502067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0502440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0502663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0503041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0503232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0503609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0503799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0504564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0504752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0505132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0505328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0505716Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0505897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0506269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0506464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0506700Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfn5sj5h6 2022-11-23T02:05:08.0506971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfn5sj5h6/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0507205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0507464Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpajzgg8jk 2022-11-23T02:05:08.0507735Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpajzgg8jk/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0507988Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdsr_5qjo 2022-11-23T02:05:08.0508256Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdsr_5qjo/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0508505Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr1g7hn1o 2022-11-23T02:05:08.0508767Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr1g7hn1o/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0508976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0509216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0509442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0509549Z ok (4.223s) 
2022-11-23T02:05:08.0509679Z 2022-11-23T02:05:08.0509967Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0510083Z Ran 1 test in 4.223s 2022-11-23T02:05:08.0510103Z 2022-11-23T02:05:08.0510197Z OK 2022-11-23T02:05:08.0510216Z 2022-11-23T02:05:08.0510355Z Generating XML reports... 2022-11-23T02:05:08.0510792Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020252.xml 2022-11-23T02:05:08.0511142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0511317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0511697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0511890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0512207Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjebl67zo 2022-11-23T02:05:08.0512478Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjebl67zo/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0512498Z 2022-11-23T02:05:08.0512608Z Running tests... 2022-11-23T02:05:08.0512874Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0513166Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0513412Z test_reduce_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0513632Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27922 2022-11-23T02:05:08.0513853Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27923 2022-11-23T02:05:08.0514069Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27924 2022-11-23T02:05:08.0514316Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27925 2022-11-23T02:05:08.0514689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0514871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0515247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0515420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0515785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0515965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0516341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0516538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0516902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0517079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0517440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0517612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0517965Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0518158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0518530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0518723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0519049Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmple2ytgrf 2022-11-23T02:05:08.0519333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmple2ytgrf/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0519586Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4k39fa58 2022-11-23T02:05:08.0519860Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4k39fa58/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0520071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0520324Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj_nb_nsl 2022-11-23T02:05:08.0520592Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj_nb_nsl/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0520841Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5ktfeqep 2022-11-23T02:05:08.0521158Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5ktfeqep/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0521383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0521615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0521840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0521942Z ok (4.327s) 
2022-11-23T02:05:08.0521961Z 2022-11-23T02:05:08.0522215Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0522334Z Ran 1 test in 4.328s 2022-11-23T02:05:08.0522353Z 2022-11-23T02:05:08.0522447Z OK 2022-11-23T02:05:08.0522466Z 2022-11-23T02:05:08.0522590Z Generating XML reports... 2022-11-23T02:05:08.0523029Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020258.xml 2022-11-23T02:05:08.0523407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0523589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0523965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0524137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0524396Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6pwsiu8y 2022-11-23T02:05:08.0524662Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6pwsiu8y/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0524682Z 2022-11-23T02:05:08.0524791Z Running tests... 2022-11-23T02:05:08.0525053Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0525366Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0525628Z test_reduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0525847Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28265 2022-11-23T02:05:08.0526064Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28266 2022-11-23T02:05:08.0526261Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28267 2022-11-23T02:05:08.0526476Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28268 2022-11-23T02:05:08.0526853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0527029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0527413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0527605Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0528013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0528199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0528558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0528747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0529115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0529292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0529670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0529858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0530283Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0530462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0530836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0531005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0531268Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuc0gy9n3 2022-11-23T02:05:08.0531536Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuc0gy9n3/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0531763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0532021Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0sgnfgur 2022-11-23T02:05:08.0532286Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0sgnfgur/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0532547Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfin7uhz0 2022-11-23T02:05:08.0532812Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfin7uhz0/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0533068Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbbuxmtt3 2022-11-23T02:05:08.0533316Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbbuxmtt3/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0533542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0533765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0533988Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0534094Z ok (6.134s) 
2022-11-23T02:05:08.0534114Z 2022-11-23T02:05:08.0534388Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0534503Z Ran 1 test in 6.134s 2022-11-23T02:05:08.0534523Z 2022-11-23T02:05:08.0534615Z OK 2022-11-23T02:05:08.0534634Z 2022-11-23T02:05:08.0534739Z Generating XML reports... 2022-11-23T02:05:08.0535175Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020305.xml 2022-11-23T02:05:08.0535552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0535730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0536107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0536306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0536558Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2u8ns_et 2022-11-23T02:05:08.0536876Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2u8ns_et/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0536897Z 2022-11-23T02:05:08.0537011Z Running tests... 2022-11-23T02:05:08.0537259Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0537576Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0537818Z test_reduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0538036Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28612 2022-11-23T02:05:08.0538250Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28613 2022-11-23T02:05:08.0538462Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28614 2022-11-23T02:05:08.0538675Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28615 2022-11-23T02:05:08.0539119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0539278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0539658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0539848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0540208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0540381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0540754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0540943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0541305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0541486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0541842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0542037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0542399Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0542572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0542944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0543132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0543387Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprcgexz_i 2022-11-23T02:05:08.0543664Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprcgexz_i/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0544155Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt0dfe32r 2022-11-23T02:05:08.0544434Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt0dfe32r/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0544670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0544900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0545149Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1yfybues 2022-11-23T02:05:08.0545413Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1yfybues/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0545637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0545896Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn1gxjyy9 2022-11-23T02:05:08.0546233Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn1gxjyy9/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0546448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0546550Z ok (4.232s) 
2022-11-23T02:05:08.0546571Z 2022-11-23T02:05:08.0546851Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0546965Z Ran 1 test in 4.232s 2022-11-23T02:05:08.0546985Z 2022-11-23T02:05:08.0547077Z OK 2022-11-23T02:05:08.0547096Z 2022-11-23T02:05:08.0547220Z Generating XML reports... 2022-11-23T02:05:08.0547646Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020314.xml 2022-11-23T02:05:08.0548012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0548233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0548610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0548800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0549052Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_e3d1qp9 2022-11-23T02:05:08.0549318Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_e3d1qp9/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0549337Z 2022-11-23T02:05:08.0549446Z Running tests... 2022-11-23T02:05:08.0549708Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0550021Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0550261Z test_reduce_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0550466Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28955 2022-11-23T02:05:08.0550683Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28956 2022-11-23T02:05:08.0550897Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28957 2022-11-23T02:05:08.0551111Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28958 2022-11-23T02:05:08.0551479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0551662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0552043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0552234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0552581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0552764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0553136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0553330Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0553695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0553870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0554237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0554426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0554789Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0554948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0555426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0555620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0555875Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5wz2line 2022-11-23T02:05:08.0556147Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5wz2line/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0556376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0556629Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw7t5ts7k 2022-11-23T02:05:08.0556892Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw7t5ts7k/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0557123Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuuqlnoey 2022-11-23T02:05:08.0557444Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuuqlnoey/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0557692Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpevrgue38 2022-11-23T02:05:08.0557954Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpevrgue38/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0558182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0558409Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0558633Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0558735Z ok (4.527s) 
2022-11-23T02:05:08.0558755Z 2022-11-23T02:05:08.0559044Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0559139Z Ran 1 test in 4.527s 2022-11-23T02:05:08.0559161Z 2022-11-23T02:05:08.0559260Z OK 2022-11-23T02:05:08.0559278Z 2022-11-23T02:05:08.0559409Z Generating XML reports... 2022-11-23T02:05:08.0559845Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020320.xml 2022-11-23T02:05:08.0560220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0560406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0560788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0560978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0561216Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpinkyu_qw 2022-11-23T02:05:08.0561481Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpinkyu_qw/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0561505Z 2022-11-23T02:05:08.0561624Z Running tests... 2022-11-23T02:05:08.0561892Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0562205Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0562454Z test_reduce_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0562672Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29322 2022-11-23T02:05:08.0562888Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29323 2022-11-23T02:05:08.0563102Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29324 2022-11-23T02:05:08.0563297Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29325 2022-11-23T02:05:08.0563663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0563841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0564265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0564527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0564894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0565068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0565441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0565645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0566053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0566230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0566660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0566848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0567210Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0567385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0567758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0567949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0568189Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7t8dyer_ 2022-11-23T02:05:08.0568459Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7t8dyer_/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0568724Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps4dbb3za 2022-11-23T02:05:08.0568991Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps4dbb3za/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0569240Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpds2ys78a 2022-11-23T02:05:08.0569501Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpds2ys78a/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0569729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0569954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0570162Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0570415Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpscbakj6f 2022-11-23T02:05:08.0570679Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpscbakj6f/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0570908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0571009Z ok (6.944s) 
2022-11-23T02:05:08.0571029Z 2022-11-23T02:05:08.0571301Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0571413Z Ran 1 test in 6.944s 2022-11-23T02:05:08.0571433Z 2022-11-23T02:05:08.0571524Z OK 2022-11-23T02:05:08.0571543Z 2022-11-23T02:05:08.0571666Z Generating XML reports... 2022-11-23T02:05:08.0572079Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020327.xml 2022-11-23T02:05:08.0572445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0572621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0572994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0573232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0573489Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprjmoj5rf 2022-11-23T02:05:08.0573756Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprjmoj5rf/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0573776Z 2022-11-23T02:05:08.0573883Z Running tests... 2022-11-23T02:05:08.0574130Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0574437Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0574671Z test_round_robin (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0574886Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29693 2022-11-23T02:05:08.0575100Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29694 2022-11-23T02:05:08.0575368Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29695 2022-11-23T02:05:08.0575581Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29696 2022-11-23T02:05:08.0575950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0576108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0576487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0576675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0577037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0577210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0577578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0577749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0578117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0578305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0578658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0578846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0579209Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0579383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0579754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0579949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0580204Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp376htd7c 2022-11-23T02:05:08.0580459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw88_a2ke 2022-11-23T02:05:08.0580707Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp376htd7c/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0580969Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw88_a2ke/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0581224Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2uj5refr 2022-11-23T02:05:08.0581481Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfg4gousx 2022-11-23T02:05:08.0581748Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2uj5refr/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0582092Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfg4gousx/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0582328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0582556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0582781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0582985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0583549Z [W 
ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0584380Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0585010Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0585548Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0585658Z ok (4.215s) 2022-11-23T02:05:08.0585679Z 2022-11-23T02:05:08.0585969Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0586089Z Ran 1 test in 4.216s 2022-11-23T02:05:08.0586109Z 2022-11-23T02:05:08.0586202Z OK 2022-11-23T02:05:08.0586221Z 2022-11-23T02:05:08.0586349Z Generating XML reports... 
2022-11-23T02:05:08.0586782Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020337.xml 2022-11-23T02:05:08.0587153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0587311Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0587687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0587879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0588134Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjg0l1f6r 2022-11-23T02:05:08.0588408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjg0l1f6r/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0588428Z 2022-11-23T02:05:08.0588537Z Running tests... 2022-11-23T02:05:08.0588803Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0589114Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0589384Z test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0589584Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30048 2022-11-23T02:05:08.0589804Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30049 2022-11-23T02:05:08.0590021Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30050 2022-11-23T02:05:08.0590239Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30051 2022-11-23T02:05:08.0590682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0590870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0591252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0591445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0591792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0591966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0592338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0592528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0592890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0593118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0593496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0593684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0594031Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0594204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0594578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0594772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0595031Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp93_trmdd 2022-11-23T02:05:08.0595308Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp93_trmdd/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0595567Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpls6f0flu 2022-11-23T02:05:08.0595840Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpls6f0flu/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0596071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0596282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0596536Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgx62w262 2022-11-23T02:05:08.0596802Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgx62w262/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0597058Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd9s5tpr4 2022-11-23T02:05:08.0597333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd9s5tpr4/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0597564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0597797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0598354Z [W 
ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0598904Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0599487Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0600032Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0600563Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0601139Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. 
(function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0601661Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0602180Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:05:08.0602265Z ok (4.530s) 2022-11-23T02:05:08.0602307Z 2022-11-23T02:05:08.0602562Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0602677Z Ran 1 test in 4.531s 2022-11-23T02:05:08.0602698Z 2022-11-23T02:05:08.0602790Z OK 2022-11-23T02:05:08.0602809Z 2022-11-23T02:05:08.0602933Z Generating XML reports... 
2022-11-23T02:05:08.0603362Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020343.xml 2022-11-23T02:05:08.0603732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0603908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0604286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0604458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0604716Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphqszagxn 2022-11-23T02:05:08.0604990Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphqszagxn/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0605010Z 2022-11-23T02:05:08.0605117Z Running tests... 2022-11-23T02:05:08.0605380Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0605690Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0605935Z test_scatter_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0606154Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30427 2022-11-23T02:05:08.0606351Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30428 2022-11-23T02:05:08.0606567Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30429 2022-11-23T02:05:08.0606779Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30430 2022-11-23T02:05:08.0607196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0607378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0607746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0607921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0608293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0608482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0608836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0609024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0609386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0609612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0609987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0610176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0610536Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0610707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0611060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0611248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0611503Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxha0koq2 2022-11-23T02:05:08.0611783Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxha0koq2/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0612010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0612263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4ftv6m76 2022-11-23T02:05:08.0612512Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_brn0ex1 2022-11-23T02:05:08.0612777Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4ftv6m76/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0613036Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_brn0ex1/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0613269Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppfn1j1a4 2022-11-23T02:05:08.0613531Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppfn1j1a4/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0613764Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0613988Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0614211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0614312Z ok (4.342s) 
2022-11-23T02:05:08.0614332Z 2022-11-23T02:05:08.0614604Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0614717Z Ran 1 test in 4.342s 2022-11-23T02:05:08.0614736Z 2022-11-23T02:05:08.0614810Z OK 2022-11-23T02:05:08.0614846Z 2022-11-23T02:05:08.0614952Z Generating XML reports... 2022-11-23T02:05:08.0615382Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020350.xml 2022-11-23T02:05:08.0615751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0615928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0616350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0616545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0616797Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzsjwfbwd 2022-11-23T02:05:08.0617068Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzsjwfbwd/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0617088Z 2022-11-23T02:05:08.0617177Z Running tests... 2022-11-23T02:05:08.0617440Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0617748Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0617998Z test_scatter_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0618272Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30770 2022-11-23T02:05:08.0618492Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30771 2022-11-23T02:05:08.0618705Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30772 2022-11-23T02:05:08.0618918Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30773 2022-11-23T02:05:08.0619270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0619450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0619829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0620018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0620385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0620566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0620939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0621127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0621488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0621646Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0622019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0622208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0622572Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0622749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0623126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0623350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0623605Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptodcqk0y 2022-11-23T02:05:08.0624072Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptodcqk0y/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0624346Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxcuwjd5k 2022-11-23T02:05:08.0624613Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxcuwjd5k/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0624863Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppvzeb2zx 2022-11-23T02:05:08.0625125Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppvzeb2zx/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0625445Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppwy_1qcy 2022-11-23T02:05:08.0625718Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppwy_1qcy/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0625949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0626176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0626380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0626606Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0626708Z ok (6.038s) 
2022-11-23T02:05:08.0626727Z 2022-11-23T02:05:08.0627005Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0627117Z Ran 1 test in 6.038s 2022-11-23T02:05:08.0627193Z 2022-11-23T02:05:08.0627287Z OK 2022-11-23T02:05:08.0627306Z 2022-11-23T02:05:08.0627435Z Generating XML reports... 2022-11-23T02:05:08.0627872Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020357.xml 2022-11-23T02:05:08.0628222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0628399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0628775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0628965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0629219Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3j2vw5yu 2022-11-23T02:05:08.0629485Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3j2vw5yu/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0629510Z 2022-11-23T02:05:08.0629618Z Running tests... 2022-11-23T02:05:08.0629884Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0630193Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0630419Z test_scatter_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0630636Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31117 2022-11-23T02:05:08.0630853Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31118 2022-11-23T02:05:08.0631066Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31119 2022-11-23T02:05:08.0631279Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31120 2022-11-23T02:05:08.0631646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0631825Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0632204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0632376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0632741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0632914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0633287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0633477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0633840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0634014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0634437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0634634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0634979Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0635153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0635528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0635715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0635973Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfw7f98tx 2022-11-23T02:05:08.0636244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfw7f98tx/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0636548Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ys4u_d_ 2022-11-23T02:05:08.0636813Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ys4u_d_/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0637022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0637250Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0637502Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo8iuwy_t 2022-11-23T02:05:08.0637767Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo8iuwy_t/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0637989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0638243Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvo4gx72u 2022-11-23T02:05:08.0638510Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvo4gx72u/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0638739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0638843Z ok (4.426s) 
2022-11-23T02:05:08.0638864Z 2022-11-23T02:05:08.0639118Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0639229Z Ran 1 test in 4.427s 2022-11-23T02:05:08.0639249Z 2022-11-23T02:05:08.0639340Z OK 2022-11-23T02:05:08.0639358Z 2022-11-23T02:05:08.0639481Z Generating XML reports... 2022-11-23T02:05:08.0639911Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020406.xml 2022-11-23T02:05:08.0640277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0640451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0640828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0641006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0641261Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpufbxuygm 2022-11-23T02:05:08.0641527Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpufbxuygm/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0641547Z 2022-11-23T02:05:08.0641655Z Running tests... 2022-11-23T02:05:08.0641918Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0642229Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0642472Z test_scatter_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0642690Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31460 2022-11-23T02:05:08.0642908Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31461 2022-11-23T02:05:08.0643170Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31462 2022-11-23T02:05:08.0643397Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31463 2022-11-23T02:05:08.0643767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0643942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0644319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0644508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0644869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0645044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0645494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0645685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0646049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0646226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0646598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0646787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0647147Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0647322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0647694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0647872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0648129Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5q58vybh 2022-11-23T02:05:08.0648401Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5q58vybh/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0648631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0648883Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_wc4k05q 2022-11-23T02:05:08.0649144Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_wc4k05q/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0649369Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0649619Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0il40e9r 2022-11-23T02:05:08.0649922Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0il40e9r/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0650173Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfjb9jqjz 2022-11-23T02:05:08.0650439Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfjb9jqjz/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0650665Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0650893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0650999Z ok (4.833s) 
2022-11-23T02:05:08.0651019Z 2022-11-23T02:05:08.0651288Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0651402Z Ran 1 test in 4.833s 2022-11-23T02:05:08.0651421Z 2022-11-23T02:05:08.0651515Z OK 2022-11-23T02:05:08.0651537Z 2022-11-23T02:05:08.0651643Z Generating XML reports... 2022-11-23T02:05:08.0652127Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020413.xml 2022-11-23T02:05:08.0652504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0652682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0653056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0653245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0653497Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwxoo_lg4 2022-11-23T02:05:08.0653763Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwxoo_lg4/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0653783Z 2022-11-23T02:05:08.0653873Z Running tests... 2022-11-23T02:05:08.0654140Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0654585Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0654903Z test_scatter_stress_cuda (__main__.ProcessGroupGlooTest) ... 
skip: Test is flaky, see https://github.com/pytorch/pytorch/issues/15963 (0.001s) 2022-11-23T02:05:08.0654923Z 2022-11-23T02:05:08.0655185Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0655298Z Ran 1 test in 0.001s 2022-11-23T02:05:08.0655317Z 2022-11-23T02:05:08.0655424Z OK (skipped=1) 2022-11-23T02:05:08.0655443Z 2022-11-23T02:05:08.0655566Z Generating XML reports... 2022-11-23T02:05:08.0655994Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020420.xml 2022-11-23T02:05:08.0656344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0656520Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0656905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0657096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0657347Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8gzb6ju8 2022-11-23T02:05:08.0657614Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8gzb6ju8/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0657634Z 2022-11-23T02:05:08.0657742Z Running tests... 2022-11-23T02:05:08.0658009Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0658318Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0658551Z test_send_recv_all_to_all (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0658768Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31892 2022-11-23T02:05:08.0658992Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31893 2022-11-23T02:05:08.0659205Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31894 2022-11-23T02:05:08.0659417Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31895 2022-11-23T02:05:08.0659786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0659962Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0660339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0660511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0660874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0661051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0661472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0661669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0662034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0662208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0662581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0662768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0663113Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0663409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0663835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0664301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0664560Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpypzctzqm 2022-11-23T02:05:08.0664834Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpypzctzqm/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0665064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0665314Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4fkbynhk 2022-11-23T02:05:08.0665564Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4fkbynhk/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0665902Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpne8o2dvp 2022-11-23T02:05:08.0666177Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpne8o2dvp/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0666403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0666628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0666879Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw2ryy9fs 2022-11-23T02:05:08.0667141Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw2ryy9fs/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0667364Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0667466Z ok (4.238s) 
2022-11-23T02:05:08.0667488Z 2022-11-23T02:05:08.0667746Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0667858Z Ran 1 test in 4.238s 2022-11-23T02:05:08.0667877Z 2022-11-23T02:05:08.0667973Z OK 2022-11-23T02:05:08.0667993Z 2022-11-23T02:05:08.0668116Z Generating XML reports... 2022-11-23T02:05:08.0668547Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020422.xml 2022-11-23T02:05:08.0668913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0669092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0669464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0669636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0669888Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp63m34i_s 2022-11-23T02:05:08.0670155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp63m34i_s/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0670178Z 2022-11-23T02:05:08.0670286Z Running tests... 2022-11-23T02:05:08.0670629Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0670952Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0671225Z test_sparse_allreduce_basics (__main__.ProcessGroupGlooTest) ... skip: intermittent failures on Windows, in CI (0.000s) 2022-11-23T02:05:08.0671245Z 2022-11-23T02:05:08.0671503Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0671615Z Ran 1 test in 0.001s 2022-11-23T02:05:08.0671634Z 2022-11-23T02:05:08.0671723Z OK (skipped=1) 2022-11-23T02:05:08.0671742Z 2022-11-23T02:05:08.0671865Z Generating XML reports... 
2022-11-23T02:05:08.0672293Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020429.xml 2022-11-23T02:05:08.0672658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0672911Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0673289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0673478Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0673734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz17a19t_ 2022-11-23T02:05:08.0674001Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz17a19t_/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0674020Z 2022-11-23T02:05:08.0674109Z Running tests... 2022-11-23T02:05:08.0674372Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0674678Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0674947Z test_sparse_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0675174Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32300 2022-11-23T02:05:08.0675391Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32301 2022-11-23T02:05:08.0675605Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 32302 2022-11-23T02:05:08.0675821Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 32303 2022-11-23T02:05:08.0676170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0676344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0676721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0676912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0677276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0677462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0677833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0678022Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0678369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0678545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0678918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0679106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0679470Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0679693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0680076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0680265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0680522Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt39jf2pc 2022-11-23T02:05:08.0680773Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt39jf2pc/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0681026Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvwzl30tl 2022-11-23T02:05:08.0681294Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvwzl30tl/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0681521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0681800Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0682057Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4o1c5pe2 2022-11-23T02:05:08.0682327Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4o1c5pe2/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0682553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0682845Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeequzba2 2022-11-23T02:05:08.0683143Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeequzba2/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0683533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0683650Z ok (6.258s) 
2022-11-23T02:05:08.0683671Z 2022-11-23T02:05:08.0683960Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0684080Z Ran 1 test in 6.259s 2022-11-23T02:05:08.0684100Z 2022-11-23T02:05:08.0684193Z OK 2022-11-23T02:05:08.0684216Z 2022-11-23T02:05:08.0684347Z Generating XML reports... 2022-11-23T02:05:08.0684779Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020431.xml 2022-11-23T02:05:08.0685153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0685334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0685697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0685886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0686137Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0g9ekqaw 2022-11-23T02:05:08.0686408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0g9ekqaw/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0686431Z 2022-11-23T02:05:08.0686545Z Running tests... 2022-11-23T02:05:08.0686811Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0687126Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0687386Z test_sparse_allreduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0687586Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33019 2022-11-23T02:05:08.0687809Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33020 2022-11-23T02:05:08.0688024Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 33021 2022-11-23T02:05:08.0688237Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 33022 2022-11-23T02:05:08.0688609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0688842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0689229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0689422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0689787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0689944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0690319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0690508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0690871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0691095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0691476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0691664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0692027Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0692184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0692556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0692745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0693002Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0glwt5pu 2022-11-23T02:05:08.0693271Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0glwt5pu/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0693530Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0u_dpbnr 2022-11-23T02:05:08.0693795Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0u_dpbnr/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0694044Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj5zzn4pp 2022-11-23T02:05:08.0694293Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsyfha4ip 2022-11-23T02:05:08.0694538Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj5zzn4pp/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0694796Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsyfha4ip/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0695028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:05:08.0695254Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:05:08.0695487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:05:08.0695710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:05:08.0695811Z ok (4.256s) 
2022-11-23T02:05:08.0695831Z 2022-11-23T02:05:08.0696099Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0696193Z Ran 1 test in 4.256s 2022-11-23T02:05:08.0696212Z 2022-11-23T02:05:08.0696304Z OK 2022-11-23T02:05:08.0696323Z 2022-11-23T02:05:08.0696445Z Generating XML reports... 2022-11-23T02:05:08.0696873Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020440.xml 2022-11-23T02:05:08.0697238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0697415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0697841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0698037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0698292Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpslqp53tg 2022-11-23T02:05:08.0698541Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpslqp53tg/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0698561Z 2022-11-23T02:05:08.0698670Z Running tests... 2022-11-23T02:05:08.0698934Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0699245Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0699427Z test_forward_backward (__main__.ReducerTest) ... ok (0.013s) 2022-11-23T02:05:08.0699446Z 2022-11-23T02:05:08.0699710Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0699872Z Ran 1 test in 0.022s 2022-11-23T02:05:08.0699892Z 2022-11-23T02:05:08.0699989Z OK 2022-11-23T02:05:08.0700012Z 2022-11-23T02:05:08.0700117Z Generating XML reports... 
2022-11-23T02:05:08.0700516Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020447.xml 2022-11-23T02:05:08.0700887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0701061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0701435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0701625Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0701877Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6v4vkmiw 2022-11-23T02:05:08.0702146Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6v4vkmiw/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0702170Z 2022-11-23T02:05:08.0702283Z Running tests... 2022-11-23T02:05:08.0702532Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0702843Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0703699Z test_forward_backward_optimizer (__main__.ReducerTest) ... [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:05:08.0703802Z ok (0.018s) 2022-11-23T02:05:08.0703821Z 2022-11-23T02:05:08.0704369Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0704489Z Ran 1 test in 0.022s 2022-11-23T02:05:08.0704510Z 2022-11-23T02:05:08.0704606Z OK 2022-11-23T02:05:08.0704627Z 2022-11-23T02:05:08.0704751Z Generating XML reports... 2022-11-23T02:05:08.0705146Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020449.xml 2022-11-23T02:05:08.0705517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0705676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0706052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0706241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0706493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpazkbaovw 2022-11-23T02:05:08.0706849Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpazkbaovw/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0706872Z 2022-11-23T02:05:08.0706984Z Running tests... 2022-11-23T02:05:08.0707255Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0707570Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0707780Z test_forward_backward_unused_parameters (__main__.ReducerTest) ... ok (0.012s) 2022-11-23T02:05:08.0707800Z 2022-11-23T02:05:08.0708041Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0708471Z Ran 1 test in 0.022s 2022-11-23T02:05:08.0708492Z 2022-11-23T02:05:08.0708588Z OK 2022-11-23T02:05:08.0708607Z 2022-11-23T02:05:08.0708733Z Generating XML reports... 
2022-11-23T02:05:08.0709128Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020451.xml 2022-11-23T02:05:08.0709580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0709757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0710134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0710305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0710559Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7dtzj8d3 2022-11-23T02:05:08.0710826Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7dtzj8d3/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0710847Z 2022-11-23T02:05:08.0710960Z Running tests... 2022-11-23T02:05:08.0711231Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0711539Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0711733Z test_multi_dtype_multi_bucket (__main__.ReducerTest) ... ok (0.006s) 2022-11-23T02:05:08.0711756Z 2022-11-23T02:05:08.0712023Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0712142Z Ran 1 test in 0.012s 2022-11-23T02:05:08.0712161Z 2022-11-23T02:05:08.0712234Z OK 2022-11-23T02:05:08.0712254Z 2022-11-23T02:05:08.0712380Z Generating XML reports... 
2022-11-23T02:05:08.0712772Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020454.xml 2022-11-23T02:05:08.0713138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0713315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0713694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0713885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0714148Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn1rnfo7k 2022-11-23T02:05:08.0714403Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn1rnfo7k/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0714441Z 2022-11-23T02:05:08.0714530Z Running tests... 2022-11-23T02:05:08.0714792Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0715103Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0715296Z test_multi_dtype_single_bucket (__main__.ReducerTest) ... ok (0.009s) 2022-11-23T02:05:08.0715316Z 2022-11-23T02:05:08.0715572Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0715686Z Ran 1 test in 0.012s 2022-11-23T02:05:08.0715706Z 2022-11-23T02:05:08.0715797Z OK 2022-11-23T02:05:08.0715817Z 2022-11-23T02:05:08.0715941Z Generating XML reports... 
2022-11-23T02:05:08.0716364Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020456.xml 2022-11-23T02:05:08.0716747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0716922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0717303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0717501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0717762Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgwgun13i 2022-11-23T02:05:08.0718033Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgwgun13i/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0718052Z 2022-11-23T02:05:08.0718159Z Running tests... 2022-11-23T02:05:08.0718403Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0718765Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0718957Z test_single_dtype_single_bucket (__main__.ReducerTest) ... ok (0.006s) 2022-11-23T02:05:08.0718976Z 2022-11-23T02:05:08.0719234Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0719346Z Ran 1 test in 0.012s 2022-11-23T02:05:08.0719365Z 2022-11-23T02:05:08.0719457Z OK 2022-11-23T02:05:08.0719476Z 2022-11-23T02:05:08.0719600Z Generating XML reports... 
2022-11-23T02:05:08.0719989Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020458.xml 2022-11-23T02:05:08.0720357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0720516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0720893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0721092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0721348Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz9h0sml4 2022-11-23T02:05:08.0721617Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz9h0sml4/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0721637Z 2022-11-23T02:05:08.0721746Z Running tests... 2022-11-23T02:05:08.0722006Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0722313Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0722526Z test_logging_init (__main__.RendezvousEnvTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0722773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:05:08.0723176Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:05:08.0723286Z ok (1.758s) 2022-11-23T02:05:08.0723306Z 2022-11-23T02:05:08.0723569Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0723686Z Ran 1 test in 1.759s 2022-11-23T02:05:08.0723705Z 2022-11-23T02:05:08.0723801Z OK 2022-11-23T02:05:08.0723820Z 2022-11-23T02:05:08.0723942Z Generating XML reports... 
2022-11-23T02:05:08.0724356Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-RendezvousEnvTest-20221123020501.xml 2022-11-23T02:05:08.0724706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:05:08.0724884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:05:08.0725263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:05:08.0725458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:05:08.0725758Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp981tk1x0 2022-11-23T02:05:08.0726030Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp981tk1x0/_remote_module_non_scriptable.py 2022-11-23T02:05:08.0726050Z 2022-11-23T02:05:08.0726159Z Running tests... 2022-11-23T02:05:08.0726424Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0726714Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:05:08.0726950Z test_default_store_timeout_gloo (__main__.TimeoutTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:05:08.0727699Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/74714 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.722s) 2022-11-23T02:05:08.0727764Z 2022-11-23T02:05:08.0728037Z ---------------------------------------------------------------------- 2022-11-23T02:05:08.0728147Z Ran 1 test in 1.722s 2022-11-23T02:05:08.0728166Z 2022-11-23T02:05:08.0728274Z OK (skipped=1) 2022-11-23T02:05:08.0728296Z 2022-11-23T02:05:08.0728420Z Generating XML reports... 
2022-11-23T02:05:08.0728810Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-TimeoutTest-20221123020505.xml 2022-11-23T02:05:08.0728830Z 2022-11-23T02:05:08.0729326Z ##[endgroup] 2022-11-23T02:05:08.0729739Z FINISHED PRINTING LOG FILE of distributed/test_c10d_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_gloo_or6yn0rj) 2022-11-23T02:05:08.0729778Z 2022-11-23T02:05:08.3061505Z 2022-11-23T02:05:08.3062516Z real 14m40.624s 2022-11-23T02:05:08.3062749Z user 29m53.276s 2022-11-23T02:05:08.3062862Z sys 23m38.441s 2022-11-23T02:05:08.3063295Z + python test/run_test.py --verbose -i distributed/test_c10d_nccl 2022-11-23T02:05:10.6602862Z Ignoring disabled issues: [] 2022-11-23T02:05:10.7131776Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:05:10.7132520Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:05:10.7132853Z Selected tests: 2022-11-23T02:05:10.7133122Z distributed/test_c10d_nccl 2022-11-23T02:05:10.7165693Z Prioritized test from test file changes. 
2022-11-23T02:05:10.7166041Z reordering tests for PR: 2022-11-23T02:05:10.7166315Z prioritized: [] 2022-11-23T02:05:10.7166785Z the rest: ['distributed/test_c10d_nccl'] 2022-11-23T02:05:10.7166977Z 2022-11-23T02:05:10.7167518Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:05:10.7168443Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:05:10.7175399Z parallel (file granularity) tests: 2022-11-23T02:05:10.7175848Z 2022-11-23T02:05:10.7176183Z serial (file granularity) tests: 2022-11-23T02:05:10.7176476Z distributed/test_c10d_nccl 2022-11-23T02:05:13.0514698Z Ignoring disabled issues: [] 2022-11-23T02:05:13.4605783Z Running distributed/test_c10d_nccl ... [2022-11-23 02:05:13.460012] 2022-11-23T02:05:13.4606812Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_nccl.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:05:13.460495] 2022-11-23T02:25:21.8164536Z 2022-11-23T02:25:21.8165290Z Expand the folded group to see the log file of distributed/test_c10d_nccl 2022-11-23T02:25:21.8166381Z ##[group]PRINTING LOG FILE of distributed/test_c10d_nccl (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_nccl_vtevv4rw) 2022-11-23T02:25:21.8169138Z , <__main__.CommTest testMethod=test_broadcast_coalesced_nccl>, <__main__.CommTest testMethod=test_nccl_barrier>, <__main__.CommTest testMethod=test_nccl_barrier_device_ids>, <__main__.CommTest testMethod=test_nccl_barrier_device_ids_function_argument>, <__main__.CommTest testMethod=test_nccl_barrier_timeout>, <__main__.CommTest testMethod=test_nccl_barrier_timeout_new_group>, <__main__.CommTest testMethod=test_nccl_barrier_timeout_new_group_non_member>, <__main__.CommTest testMethod=test_nccl_warn_not_in_group_debug_detail>, <__main__.CommTest testMethod=test_nccl_warn_not_in_group_debug_info>, <__main__.CommTest testMethod=test_nccl_warn_not_in_group_debug_off>, <__main__.CommTest testMethod=test_nncl_rank_membership>, <__main__.CommTest testMethod=test_pass_nccl_options_high_priority_stream>, <__main__.CommTest testMethod=test_sequence_num_incremented_nccl_default>, <__main__.CommTest testMethod=test_sequence_num_incremented_nccl_subgroup>, <__main__.CommTest testMethod=test_sequence_num_set_default_pg_nccl>, <__main__.CommTest testMethod=test_sequence_num_set_nccl_new_group>, <__main__.CommTest testMethod=test_tensor_dtype_complex>, <__main__.CommTest testMethod=test_tensor_dtype_mismatch>]> 2022-11-23T02:25:21.8171250Z test_all_reduce_coalesced_nccl (__main__.CommTest) 2022-11-23T02:25:21.8171610Z test_broadcast_coalesced_nccl (__main__.CommTest) 2022-11-23T02:25:21.8171985Z test_nccl_barrier (__main__.CommTest) 2022-11-23T02:25:21.8172235Z test_nccl_barrier_device_ids (__main__.CommTest) 2022-11-23T02:25:21.8172604Z test_nccl_barrier_device_ids_function_argument (__main__.CommTest) 
2022-11-23T02:25:21.8172946Z test_nccl_barrier_timeout (__main__.CommTest) 2022-11-23T02:25:21.8173291Z test_nccl_barrier_timeout_new_group (__main__.CommTest) 2022-11-23T02:25:21.8174811Z test_nccl_barrier_timeout_new_group_non_member (__main__.CommTest) 2022-11-23T02:25:21.8175366Z test_nccl_warn_not_in_group_debug_detail (__main__.CommTest) 2022-11-23T02:25:21.8175801Z test_nccl_warn_not_in_group_debug_info (__main__.CommTest) 2022-11-23T02:25:21.8176289Z test_nccl_warn_not_in_group_debug_off (__main__.CommTest) 2022-11-23T02:25:21.8176839Z test_nncl_rank_membership (__main__.CommTest) 2022-11-23T02:25:21.8177340Z test_pass_nccl_options_high_priority_stream (__main__.CommTest) 2022-11-23T02:25:21.8177840Z test_sequence_num_incremented_nccl_default (__main__.CommTest) 2022-11-23T02:25:21.8178446Z test_sequence_num_incremented_nccl_subgroup (__main__.CommTest) 2022-11-23T02:25:21.8179106Z test_sequence_num_set_default_pg_nccl (__main__.CommTest) 2022-11-23T02:25:21.8179771Z test_sequence_num_set_nccl_new_group (__main__.CommTest) 2022-11-23T02:25:21.8180564Z test_tensor_dtype_complex (__main__.CommTest) 2022-11-23T02:25:21.8181008Z test_tensor_dtype_mismatch (__main__.CommTest) 2022-11-23T02:25:21.8182510Z , <__main__.CompilerTest testMethod=test_allreduce_work_wait_gpu>, <__main__.CompilerTest testMethod=test_broadcast_work_wait_gpu>, <__main__.CompilerTest testMethod=test_consecutive_comm_work_wait_gpu>, <__main__.CompilerTest testMethod=test_nested_comm_tensor_wrapping>, <__main__.CompilerTest testMethod=test_reduce_scatter_work_wait_gpu>, <__main__.CompilerTest testMethod=test_scatter_work_wait_gpu>]> 2022-11-23T02:25:21.8184881Z test_allgather_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:25:21.8185542Z test_allreduce_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:25:21.8186179Z test_broadcast_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:25:21.8186572Z test_consecutive_comm_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:25:21.8187048Z 
test_nested_comm_tensor_wrapping (__main__.CompilerTest) 2022-11-23T02:25:21.8187372Z test_reduce_scatter_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:25:21.8187771Z test_scatter_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:25:21.8200504Z , <__main__.DistributedDataParallelTest testMethod=test_accumulate_gradients_module_with_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_arbitrary_forward_return_value>, <__main__.DistributedDataParallelTest testMethod=test_arbitrary_forward_return_value_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_bf16_compress_wrapper_is_view>, <__main__.DistributedDataParallelTest testMethod=test_bf16_compress_wrapper_nccl>, <__main__.DistributedDataParallelTest testMethod=test_builtin_ddp_comm_hooks_nccl>, <__main__.DistributedDataParallelTest testMethod=test_builtin_ddp_comm_hooks_nccl_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_channels_last_contig>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_dynamic_module>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_dynamic_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_False>, 
<__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_allreduce_hook_nccl>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_allreduce_hook_nccl_static_graph>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_allreduce_with_then_hook_nccl>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_gpu_nccl>, <__main__.DistributedDataParallelTest testMethod=test_ddp_multi_device_module_config>, <__main__.DistributedDataParallelTest testMethod=test_ddp_packed_sequence>, <__main__.DistributedDataParallelTest testMethod=test_ddp_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_with_lazy_parameters>, <__main__.DistributedDataParallelTest testMethod=test_default_ddp_comm_hooks_nccl>, <__main__.DistributedDataParallelTest testMethod=test_default_ddp_comm_hooks_nccl_is_view>, <__main__.DistributedDataParallelTest testMethod=test_failure_recovery>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_debug_detail>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_debug_info>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_debug_off>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_grad_is_view_debug_detail>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_grad_is_view_debug_info>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_grad_is_view_debug_off>, 
<__main__.DistributedDataParallelTest testMethod=test_fp16>, <__main__.DistributedDataParallelTest testMethod=test_fp16_compress_wrapper_is_view>, <__main__.DistributedDataParallelTest testMethod=test_fp16_compress_wrapper_nccl>, <__main__.DistributedDataParallelTest testMethod=test_fp16_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_grad_layout_1devicemodule_1replicaperprocess>, <__main__.DistributedDataParallelTest testMethod=test_grad_layout_2devicemodule>, <__main__.DistributedDataParallelTest testMethod=test_invalid_powerSGD_state>, <__main__.DistributedDataParallelTest testMethod=test_multiple_outputs_multiple_backward>, <__main__.DistributedDataParallelTest testMethod=test_multiple_outputs_multiple_backward_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_1gpu_module_device_ids_integer_list>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_1gpu_module_device_ids_torch_device_list>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_2gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_4gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_multi_device_ids_not_allowed>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_multi_device_module_device_ids_None>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_single_device_module_device_ids_None>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_single_device_module_empty_device_ids>, <__main__.DistributedDataParallelTest testMethod=test_nccl_propagate_error_reason>, <__main__.DistributedDataParallelTest testMethod=test_no_grad>, <__main__.DistributedDataParallelTest testMethod=test_param_layout_mismatch_error>, <__main__.DistributedDataParallelTest testMethod=test_pass_default_pg>, <__main__.DistributedDataParallelTest testMethod=test_powerSGD_ddp_comm_hook_nccl>, <__main__.DistributedDataParallelTest 
testMethod=test_powerSGD_ddp_comm_hook_nccl_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_empty_input>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_only_empty_input>]> 2022-11-23T02:25:21.8213133Z test_accumulate_gradients_module (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8214019Z test_accumulate_gradients_module_with_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8214878Z test_arbitrary_forward_return_value (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8215764Z test_arbitrary_forward_return_value_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8216416Z test_bf16_compress_wrapper_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8216979Z test_bf16_compress_wrapper_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8217588Z test_builtin_ddp_comm_hooks_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8218254Z test_builtin_ddp_comm_hooks_nccl_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8218818Z test_channels_last_contig (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8219234Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8219877Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8220594Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8221345Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8222186Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8222897Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8223420Z test_ddp_checkpointing_twice_use_reentrant_False 
(__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8224614Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8225181Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8225654Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8226124Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8226718Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8227328Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8227814Z test_ddp_comm_hook_allreduce_hook_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8229150Z test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8229563Z test_ddp_comm_hook_allreduce_hook_nccl_static_graph (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8230122Z test_ddp_comm_hook_allreduce_with_then_hook_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8230598Z test_ddp_comm_hook_future_passing_gpu_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8231068Z test_ddp_multi_device_module_config (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8231561Z test_ddp_packed_sequence (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8232025Z test_ddp_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8232463Z test_ddp_with_lazy_parameters (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8232900Z test_default_ddp_comm_hooks_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8233337Z test_default_ddp_comm_hooks_nccl_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8233777Z test_failure_recovery 
(__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8234245Z test_find_unused_parameters_kwarg_debug_detail (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8234728Z test_find_unused_parameters_kwarg_debug_info (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8235181Z test_find_unused_parameters_kwarg_debug_off (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8235681Z test_find_unused_parameters_kwarg_grad_is_view_debug_detail (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8236213Z test_find_unused_parameters_kwarg_grad_is_view_debug_info (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8236754Z test_find_unused_parameters_kwarg_grad_is_view_debug_off (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8237143Z test_fp16 (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8237592Z test_fp16_compress_wrapper_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8238038Z test_fp16_compress_wrapper_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8238402Z test_fp16_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8238883Z test_grad_layout_1devicemodule_1replicaperprocess (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8239312Z test_grad_layout_2devicemodule (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8239803Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8240257Z test_multiple_outputs_multiple_backward (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8240660Z test_multiple_outputs_multiple_backward_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8241239Z test_nccl_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8241693Z test_nccl_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8242219Z test_nccl_backend_2gpu_module 
(__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8242629Z test_nccl_backend_4gpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8243116Z test_nccl_backend_multi_device_ids_not_allowed (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8243505Z test_nccl_backend_multi_device_module_device_ids_None (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8244070Z test_nccl_backend_single_device_module_device_ids_None (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8244603Z test_nccl_backend_single_device_module_empty_device_ids (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8245081Z test_nccl_propagate_error_reason (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8245468Z test_no_grad (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8245956Z test_param_layout_mismatch_error (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8246391Z test_pass_default_pg (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8246773Z test_powerSGD_ddp_comm_hook_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8247264Z test_powerSGD_ddp_comm_hook_nccl_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8247695Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8248129Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8248543Z 2022-11-23T02:25:21.8249811Z , <__main__.NcclErrorHandlingTest testMethod=test_nccl_blocking_wait_with_barrier>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_abort>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_clean_exit>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_nonzero_exit>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_sigkill>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_sigterm>, <__main__.NcclErrorHandlingTest 
testMethod=test_nccl_errors_nonblocking>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_timeout>]> 2022-11-23T02:25:21.8251098Z test_invalid_nccl_blocking_wait_env (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8251489Z test_nccl_blocking_wait_with_barrier (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8251867Z test_nccl_errors_blocking_abort (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8252271Z test_nccl_errors_blocking_clean_exit (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8252729Z test_nccl_errors_blocking_nonzero_exit (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8253107Z test_nccl_errors_blocking_sigkill (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8253447Z test_nccl_errors_blocking_sigterm (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8253923Z test_nccl_errors_nonblocking (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8254222Z test_nccl_timeout (__main__.NcclErrorHandlingTest) 2022-11-23T02:25:21.8255244Z , <__main__.NcclProcessGroupWithDispatchedCollectivesTests testMethod=test_allreduce_coalesced>, <__main__.NcclProcessGroupWithDispatchedCollectivesTests testMethod=test_collectives>, <__main__.NcclProcessGroupWithDispatchedCollectivesTests testMethod=test_reduce_scatter_base>]> 2022-11-23T02:25:21.8256395Z test_allgather_base (__main__.NcclProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:25:21.8256947Z test_allreduce_coalesced (__main__.NcclProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:25:21.8257448Z test_collectives (__main__.NcclProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:25:21.8258009Z test_reduce_scatter_base (__main__.NcclProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:25:21.8283098Z ]> 2022-11-23T02:25:21.8283742Z test_init_no_gpus (__main__.ProcessGroupNCCLNoGPUTest) 2022-11-23T02:25:21.8285831Z , <__main__.ProcessGroupNCCLTest testMethod=test_allgather_base_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_allgather_ops>, 
<__main__.ProcessGroupNCCLTest testMethod=test_allreduce_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_barrier>, <__main__.ProcessGroupNCCLTest testMethod=test_broadcast_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_empty_tensors>, <__main__.ProcessGroupNCCLTest testMethod=test_gather_checks>, <__main__.ProcessGroupNCCLTest testMethod=test_gather_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_gather_stress>, <__main__.ProcessGroupNCCLTest testMethod=test_nccl_dist_backend_error>, <__main__.ProcessGroupNCCLTest testMethod=test_reduce_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_reduce_scatter_base_basics>, <__main__.ProcessGroupNCCLTest testMethod=test_reduce_scatter_base_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_reduce_scatter_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_scatter_checks>, <__main__.ProcessGroupNCCLTest testMethod=test_scatter_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_scatter_stress>, <__main__.ProcessGroupNCCLTest testMethod=test_send_recv>]> 2022-11-23T02:25:21.8288015Z test_allgather_base_basics (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8288390Z test_allgather_base_ops (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8288853Z test_allgather_ops (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8289238Z test_allreduce_ops (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8289653Z test_barrier (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8290040Z test_broadcast_ops (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8290321Z test_empty_tensors (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8290699Z test_gather_checks (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8291128Z test_gather_ops (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8291497Z test_gather_stress (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8291851Z test_nccl_dist_backend_error (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8292167Z test_reduce_ops 
(__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8292618Z test_reduce_scatter_base_basics (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8292990Z test_reduce_scatter_base_ops (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8293409Z test_reduce_scatter_ops (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8293824Z test_scatter_checks (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8294226Z test_scatter_ops (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8294561Z test_scatter_stress (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8294925Z test_send_recv (__main__.ProcessGroupNCCLTest) 2022-11-23T02:25:21.8295374Z ]> 2022-11-23T02:25:21.8295794Z test_common_errors (__main__.RendezvousEnvTest) 2022-11-23T02:25:21.8296073Z 2022-11-23T02:25:21.8296562Z ]> 2022-11-23T02:25:21.8296909Z test_default_store_timeout_nccl (__main__.TimeoutTest) 2022-11-23T02:25:21.8297737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8298174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8298757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8299297Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8299533Z 2022-11-23T02:25:21.8299665Z Running tests... 2022-11-23T02:25:21.8300087Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8300626Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8301126Z test_all_reduce_coalesced_nccl (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8301545Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34222 2022-11-23T02:25:21.8302083Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34223 2022-11-23T02:25:21.8302609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8303129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8303736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8304974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8305514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8306019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8306634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8307096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8307553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8308040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8309069Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:25:21.8309752Z warnings.warn( 2022-11-23T02:25:21.8310627Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:25:21.8311297Z warnings.warn( 2022-11-23T02:25:21.8311559Z ok (6.802s) 2022-11-23T02:25:21.8311716Z 2022-11-23T02:25:21.8311971Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8312227Z Ran 1 test in 6.802s 2022-11-23T02:25:21.8312416Z 2022-11-23T02:25:21.8312592Z OK 2022-11-23T02:25:21.8312706Z 2022-11-23T02:25:21.8312822Z Generating XML reports... 2022-11-23T02:25:21.8313296Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020517.xml 2022-11-23T02:25:21.8314068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8314455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8315005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8315592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8315734Z 2022-11-23T02:25:21.8315847Z Running tests... 2022-11-23T02:25:21.8316261Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8316781Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8317282Z test_broadcast_coalesced_nccl (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8317756Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34435 2022-11-23T02:25:21.8318199Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34436 2022-11-23T02:25:21.8318819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8319277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8319863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8320410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8320917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8321440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8322163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8322660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8323158Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8323623Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8323937Z ok (6.807s) 2022-11-23T02:25:21.8324094Z 2022-11-23T02:25:21.8324373Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8324751Z Ran 1 test in 6.807s 2022-11-23T02:25:21.8324921Z 2022-11-23T02:25:21.8324994Z OK 2022-11-23T02:25:21.8325200Z 2022-11-23T02:25:21.8325322Z Generating XML reports... 
2022-11-23T02:25:21.8325819Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020526.xml 2022-11-23T02:25:21.8326557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8326996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8327578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8328122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8328324Z 2022-11-23T02:25:21.8328415Z Running tests... 2022-11-23T02:25:21.8328846Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8329443Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8329920Z test_nccl_barrier (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8330352Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34648 2022-11-23T02:25:21.8330825Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34649 2022-11-23T02:25:21.8331442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8331877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8332461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8332964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8333535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8333975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8334652Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8335123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8335609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8336071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8336619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8337093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8337775Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8338489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8339027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:21.8339610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:21.8340270Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8340976Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8341507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:25:21.8342034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:25:21.8342674Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:25:21.8343389Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:25:21.8344821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:25:21.8345415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:25:21.8346061Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:25:21.8346839Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:25:21.8347259Z ok (7.112s) 2022-11-23T02:25:21.8347412Z 2022-11-23T02:25:21.8347689Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8348005Z Ran 1 test in 7.112s 2022-11-23T02:25:21.8348174Z 2022-11-23T02:25:21.8348280Z OK 2022-11-23T02:25:21.8348422Z 2022-11-23T02:25:21.8348557Z Generating XML reports... 2022-11-23T02:25:21.8349089Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020536.xml 2022-11-23T02:25:21.8349763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8350246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8350854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8351298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8351548Z 2022-11-23T02:25:21.8351663Z Running tests... 
2022-11-23T02:25:21.8352087Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8352603Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8353090Z test_nccl_barrier_device_ids (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8353567Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34884 2022-11-23T02:25:21.8354041Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34885 2022-11-23T02:25:21.8354634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8355104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8355696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8356154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8356738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8357200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8357782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8358243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8358689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8359207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8359702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8360167Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 0 2022-11-23T02:25:21.8360845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8361622Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8361924Z ok (5.706s) 2022-11-23T02:25:21.8362077Z 2022-11-23T02:25:21.8362351Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8362771Z Ran 1 test in 5.706s 2022-11-23T02:25:21.8362941Z 2022-11-23T02:25:21.8363038Z OK 2022-11-23T02:25:21.8363152Z 2022-11-23T02:25:21.8363282Z Generating XML reports... 2022-11-23T02:25:21.8363841Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020545.xml 2022-11-23T02:25:21.8364518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8364953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8365534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8366008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8366243Z 2022-11-23T02:25:21.8366449Z Running tests... 2022-11-23T02:25:21.8366754Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8367303Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8367851Z test_nccl_barrier_device_ids_function_argument (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8368291Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35096 2022-11-23T02:25:21.8368853Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35097 2022-11-23T02:25:21.8369375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8369835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8370395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8370877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8371471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8371926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8372575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8373036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8373494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8373874Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8374367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8374865Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8375529Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8376335Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8376760Z ok (4.091s) 2022-11-23T02:25:21.8376914Z 2022-11-23T02:25:21.8377191Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8377514Z Ran 1 test in 4.091s 2022-11-23T02:25:21.8377681Z 2022-11-23T02:25:21.8377780Z OK 2022-11-23T02:25:21.8377924Z 2022-11-23T02:25:21.8378054Z Generating XML reports... 2022-11-23T02:25:21.8378605Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020553.xml 2022-11-23T02:25:21.8379250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8379709Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8380289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8380811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8381056Z 2022-11-23T02:25:21.8381172Z Running tests... 2022-11-23T02:25:21.8381591Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8382131Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8382587Z test_nccl_barrier_timeout (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8383058Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35299 2022-11-23T02:25:21.8383517Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35300 2022-11-23T02:25:21.8384420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8384797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8385396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8385887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8386452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8386912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8387494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8387971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8388395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8388876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8389232Z ok (14.149s) 2022-11-23T02:25:21.8389385Z 2022-11-23T02:25:21.8389717Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8389998Z Ran 1 test in 14.149s 2022-11-23T02:25:21.8390168Z 2022-11-23T02:25:21.8390272Z OK 2022-11-23T02:25:21.8390412Z 2022-11-23T02:25:21.8390517Z Generating XML reports... 
2022-11-23T02:25:21.8391070Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020600.xml 2022-11-23T02:25:21.8391741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8392198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8392754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8393234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8393470Z 2022-11-23T02:25:21.8393582Z Running tests... 2022-11-23T02:25:21.8394076Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8394614Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8395203Z test_nccl_barrier_timeout_new_group (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8395692Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35502 2022-11-23T02:25:21.8396130Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35503 2022-11-23T02:25:21.8396756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8397213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8397794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8398320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8398915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8399366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8399920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8400396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8400841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8401316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8401793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8402302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8402973Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8403678Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8404204Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:25:21.8404572Z ok (9.118s) 2022-11-23T02:25:21.8404725Z 2022-11-23T02:25:21.8405000Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8405316Z Ran 1 test in 9.119s 2022-11-23T02:25:21.8405481Z 2022-11-23T02:25:21.8405579Z OK 2022-11-23T02:25:21.8405718Z 2022-11-23T02:25:21.8405846Z Generating XML reports... 2022-11-23T02:25:21.8406372Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020616.xml 2022-11-23T02:25:21.8407045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8407593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8408092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8408548Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8408784Z 2022-11-23T02:25:21.8408897Z Running tests... 2022-11-23T02:25:21.8409311Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8409852Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8410338Z test_nccl_barrier_timeout_new_group_non_member (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8410829Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35726 2022-11-23T02:25:21.8411297Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35727 2022-11-23T02:25:21.8411940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8412422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8413008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8413492Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8414090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8414511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8415088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8415536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8416064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8416550Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8417059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8417549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8418220Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8418917Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8419458Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:25:21.8419800Z ok (9.013s) 2022-11-23T02:25:21.8419961Z 2022-11-23T02:25:21.8420236Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8420581Z Ran 1 test in 9.013s 2022-11-23T02:25:21.8420749Z 2022-11-23T02:25:21.8420822Z OK 2022-11-23T02:25:21.8420960Z 2022-11-23T02:25:21.8421090Z Generating XML reports... 2022-11-23T02:25:21.8421730Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020628.xml 2022-11-23T02:25:21.8422410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8422850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8423434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8424291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8424542Z 2022-11-23T02:25:21.8424620Z Running tests... 2022-11-23T02:25:21.8425052Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8425584Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8426015Z test_nccl_warn_not_in_group_debug_detail (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8426473Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35941 2022-11-23T02:25:21.8426934Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35942 2022-11-23T02:25:21.8427554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8427991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8428575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8429054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8429720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8430167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8430752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8431231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8431661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8432140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8432630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8433138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8433860Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8434407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:21.8435061Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8435601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:21.8436232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8436923Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8437327Z ok (5.799s) 2022-11-23T02:25:21.8437483Z 2022-11-23T02:25:21.8437759Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8438079Z Ran 1 test in 5.799s 2022-11-23T02:25:21.8438245Z 2022-11-23T02:25:21.8438343Z OK 2022-11-23T02:25:21.8438485Z 2022-11-23T02:25:21.8438615Z Generating XML reports... 2022-11-23T02:25:21.8439143Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020640.xml 2022-11-23T02:25:21.8439818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8440279Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8440866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8441326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8441559Z 2022-11-23T02:25:21.8441672Z Running tests... 
2022-11-23T02:25:21.8442082Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8442604Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8443105Z test_nccl_warn_not_in_group_debug_info (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8443649Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36168 2022-11-23T02:25:21.8444049Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36169 2022-11-23T02:25:21.8444640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8445098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8445684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8446139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8446730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8447232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8447913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8448278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8448731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8449229Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8449699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8450187Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8450850Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8451468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:21.8452105Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8452640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:21.8453294Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8453987Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8454365Z ok (5.806s) 2022-11-23T02:25:21.8454518Z 2022-11-23T02:25:21.8454792Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8455131Z Ran 1 test in 5.807s 2022-11-23T02:25:21.8455299Z 2022-11-23T02:25:21.8455372Z OK 2022-11-23T02:25:21.8455521Z 2022-11-23T02:25:21.8455650Z Generating XML reports... 2022-11-23T02:25:21.8456204Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020648.xml 2022-11-23T02:25:21.8456879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8457318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8457901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8458380Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8458614Z 2022-11-23T02:25:21.8458734Z Running tests... 
2022-11-23T02:25:21.8459124Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8459666Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8460163Z test_nccl_warn_not_in_group_debug_off (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8460621Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36386 2022-11-23T02:25:21.8461086Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36387 2022-11-23T02:25:21.8461701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8462169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8462732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8463208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8463793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8464630Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8465327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8465820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8466186Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8466658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8467158Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8467656Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8468300Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8468848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:21.8469585Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8470129Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:21.8470760Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8471459Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8471866Z ok (5.825s) 2022-11-23T02:25:21.8472019Z 2022-11-23T02:25:21.8472289Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8472606Z Ran 1 test in 5.825s 2022-11-23T02:25:21.8472857Z 2022-11-23T02:25:21.8472868Z OK 2022-11-23T02:25:21.8473005Z 2022-11-23T02:25:21.8473137Z Generating XML reports... 2022-11-23T02:25:21.8473668Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020656.xml 2022-11-23T02:25:21.8474351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8474818Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8475408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8475863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8476097Z 2022-11-23T02:25:21.8476211Z Running tests... 
2022-11-23T02:25:21.8476625Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8477144Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8477619Z test_nncl_rank_membership (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8478095Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36604 2022-11-23T02:25:21.8478566Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36605 2022-11-23T02:25:21.8479162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8479624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8480208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8480665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8481249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8481703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8482278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8482779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8483242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8483745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8484244Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8484711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 1 2022-11-23T02:25:21.8485376Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8486078Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8486602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:21.8487166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:21.8487827Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8488519Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8488898Z ok (4.035s) 2022-11-23T02:25:21.8489054Z 2022-11-23T02:25:21.8489326Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8489667Z Ran 1 test in 4.035s 2022-11-23T02:25:21.8489831Z 2022-11-23T02:25:21.8489927Z OK 2022-11-23T02:25:21.8490055Z 2022-11-23T02:25:21.8490232Z Generating XML reports... 2022-11-23T02:25:21.8490731Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020704.xml 2022-11-23T02:25:21.8491415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8491852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8492438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8492921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8493158Z 2022-11-23T02:25:21.8493273Z Running tests... 
2022-11-23T02:25:21.8493665Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8494204Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8494715Z test_pass_nccl_options_high_priority_stream (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8495178Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36809 2022-11-23T02:25:21.8495645Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36810 2022-11-23T02:25:21.8496261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8496729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8497294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8497771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8498356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8498788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8499363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8499834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8500337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8500824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8501318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8501819Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8502491Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8503010Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:21.8503660Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8504488Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:21.8505135Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8505836Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8506248Z ok (6.797s) 2022-11-23T02:25:21.8506401Z 2022-11-23T02:25:21.8506673Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8506996Z Ran 1 test in 6.798s 2022-11-23T02:25:21.8507164Z 2022-11-23T02:25:21.8507260Z OK 2022-11-23T02:25:21.8507397Z 2022-11-23T02:25:21.8507527Z Generating XML reports... 2022-11-23T02:25:21.8508053Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020711.xml 2022-11-23T02:25:21.8508727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8509188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8509773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8510230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8510465Z 2022-11-23T02:25:21.8510581Z Running tests... 
2022-11-23T02:25:21.8510992Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8511527Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8512021Z test_sequence_num_incremented_nccl_default (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8512516Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37026 2022-11-23T02:25:21.8512973Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37027 2022-11-23T02:25:21.8513619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8514032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8514618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8515097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8515658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8516118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8516698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8517153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8517600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8518170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8518680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8519146Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8519902Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8520518Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8521062Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:21.8521617Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:21.8522376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8523076Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8523455Z ok (5.705s) 2022-11-23T02:25:21.8523613Z 2022-11-23T02:25:21.8523890Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8524226Z Ran 1 test in 5.705s 2022-11-23T02:25:21.8524393Z 2022-11-23T02:25:21.8524491Z OK 2022-11-23T02:25:21.8524605Z 2022-11-23T02:25:21.8524735Z Generating XML reports... 2022-11-23T02:25:21.8525286Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020720.xml 2022-11-23T02:25:21.8525957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8526397Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8526992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8527479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8527718Z 2022-11-23T02:25:21.8527839Z Running tests... 
2022-11-23T02:25:21.8528230Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8528772Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8529281Z test_sequence_num_incremented_nccl_subgroup (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8529783Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37244 2022-11-23T02:25:21.8530207Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37245 2022-11-23T02:25:21.8530803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8531259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8531820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8532300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8532885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8533336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8533890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8534447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8534798Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8535260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8535614Z ok (4.087s) 2022-11-23T02:25:21.8535822Z 2022-11-23T02:25:21.8536109Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8536454Z Ran 1 test in 4.087s 
2022-11-23T02:25:21.8536597Z 2022-11-23T02:25:21.8536697Z OK 2022-11-23T02:25:21.8536837Z 2022-11-23T02:25:21.8536964Z Generating XML reports... 2022-11-23T02:25:21.8537507Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020728.xml 2022-11-23T02:25:21.8538146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8538593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8539170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8539646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8539938Z 2022-11-23T02:25:21.8540062Z Running tests... 2022-11-23T02:25:21.8540576Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8541031Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8541509Z test_sequence_num_set_default_pg_nccl (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8541991Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37443 2022-11-23T02:25:21.8542528Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37444 2022-11-23T02:25:21.8543055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8543555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8544409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8544911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8545427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8545964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8546466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8546909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8547332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8547827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8548324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8548796Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8549460Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8550157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8550557Z ok (5.711s) 2022-11-23T02:25:21.8550710Z 2022-11-23T02:25:21.8551077Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8551419Z Ran 1 test in 5.711s 2022-11-23T02:25:21.8551586Z 2022-11-23T02:25:21.8551684Z OK 2022-11-23T02:25:21.8551821Z 2022-11-23T02:25:21.8551948Z Generating XML reports... 2022-11-23T02:25:21.8552474Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020735.xml 2022-11-23T02:25:21.8553143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8553681Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8554258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8554742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8555050Z 2022-11-23T02:25:21.8555096Z Running tests... 2022-11-23T02:25:21.8555515Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8556029Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8556521Z test_sequence_num_set_nccl_new_group (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8556999Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37655 2022-11-23T02:25:21.8557435Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37656 2022-11-23T02:25:21.8558136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8558599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8559180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8559635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8560221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8560675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8561234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8561712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8562160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8562666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8563134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8563628Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8564424Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8564968Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:21.8565599Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8566135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:21.8566803Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8567516Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:21.8567925Z ok (5.690s) 2022-11-23T02:25:21.8568080Z 2022-11-23T02:25:21.8568353Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8568695Z Ran 1 test in 5.690s 2022-11-23T02:25:21.8568837Z 2022-11-23T02:25:21.8568935Z OK 2022-11-23T02:25:21.8569079Z 2022-11-23T02:25:21.8569210Z Generating XML reports... 2022-11-23T02:25:21.8569756Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020743.xml 2022-11-23T02:25:21.8570403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8570865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8571511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8572005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8572241Z 2022-11-23T02:25:21.8572330Z Running tests... 
2022-11-23T02:25:21.8572751Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8573288Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8573746Z test_tensor_dtype_complex (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8574219Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37871 2022-11-23T02:25:21.8574680Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37872 2022-11-23T02:25:21.8575297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8575795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8576383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8576922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8577462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8577889Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8578473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8578942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8579361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8579860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8580354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8580852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 0 2022-11-23T02:25:21.8581492Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8582184Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8582593Z ok (6.801s) 2022-11-23T02:25:21.8582753Z 2022-11-23T02:25:21.8583029Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8583349Z Ran 1 test in 6.801s 2022-11-23T02:25:21.8583515Z 2022-11-23T02:25:21.8583613Z OK 2022-11-23T02:25:21.8583753Z 2022-11-23T02:25:21.8584102Z Generating XML reports... 2022-11-23T02:25:21.8584668Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020751.xml 2022-11-23T02:25:21.8585427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8585872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8586459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8586931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8587127Z 2022-11-23T02:25:21.8587275Z Running tests... 2022-11-23T02:25:21.8587614Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8588128Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8588612Z test_tensor_dtype_mismatch (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8589085Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38084 2022-11-23T02:25:21.8589592Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38085 2022-11-23T02:25:21.8590229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8590785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8591273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8591726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8592314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8592769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8593343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8593880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8594328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8594836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8595308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8595805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8596475Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8597170Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8598217Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:25:21.8598882Z warnings.warn( 2022-11-23T02:25:21.8599764Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:25:21.8600385Z warnings.warn( 2022-11-23T02:25:21.8601238Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:25:21.8601854Z warnings.warn( 2022-11-23T02:25:21.8602746Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:25:21.8603372Z warnings.warn( 2022-11-23T02:25:21.8603602Z ok (5.653s) 2022-11-23T02:25:21.8603752Z 2022-11-23T02:25:21.8604027Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8604366Z Ran 1 test in 5.654s 2022-11-23T02:25:21.8604531Z 2022-11-23T02:25:21.8604630Z OK 2022-11-23T02:25:21.8604743Z 2022-11-23T02:25:21.8604875Z Generating XML reports... 
2022-11-23T02:25:21.8605426Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020801.xml 2022-11-23T02:25:21.8606098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8606537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8607173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8607662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8607898Z 2022-11-23T02:25:21.8608011Z Running tests... 2022-11-23T02:25:21.8608404Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8608946Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8609448Z test_allgather_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8609908Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38289 2022-11-23T02:25:21.8610368Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38290 2022-11-23T02:25:21.8610978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8611505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8612067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8612555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8613143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8613595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8614152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8614627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8615076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8615536Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8616035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8616540Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8617257Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8617883Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8618818Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8619557Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8620421Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8621122Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8621459Z ok (7.023s) 2022-11-23T02:25:21.8621690Z 2022-11-23T02:25:21.8621989Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8622340Z Ran 1 test in 7.023s 2022-11-23T02:25:21.8622482Z 2022-11-23T02:25:21.8622581Z OK 2022-11-23T02:25:21.8622719Z 2022-11-23T02:25:21.8622848Z Generating XML reports... 
2022-11-23T02:25:21.8623419Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020809.xml 2022-11-23T02:25:21.8624540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8625024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8625634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8626108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8626361Z 2022-11-23T02:25:21.8626414Z Running tests... 2022-11-23T02:25:21.8626779Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8627324Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8627798Z test_allreduce_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8628273Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38500 2022-11-23T02:25:21.8628729Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38501 2022-11-23T02:25:21.8629432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8629872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8630460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8630957Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8631528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8631953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8632526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8633002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8633433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8633915Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8634407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8634918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8635564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8636265Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8637279Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8638007Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8638853Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8639627Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8639976Z ok (7.019s) 2022-11-23T02:25:21.8640120Z 2022-11-23T02:25:21.8640335Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8640660Z Ran 1 test in 7.019s 2022-11-23T02:25:21.8640828Z 2022-11-23T02:25:21.8640926Z OK 2022-11-23T02:25:21.8641060Z 2022-11-23T02:25:21.8641187Z Generating XML reports... 
2022-11-23T02:25:21.8641825Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020818.xml 2022-11-23T02:25:21.8642481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8642942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8643532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8643985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8644221Z 2022-11-23T02:25:21.8644335Z Running tests... 2022-11-23T02:25:21.8644746Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8645258Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8645752Z test_broadcast_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8646304Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38711 2022-11-23T02:25:21.8646848Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38712 2022-11-23T02:25:21.8647443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8647906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8648490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8648948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8649530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8649986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8650563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8651015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8651452Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8651916Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8652406Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8652887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8653551Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8654252Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8655227Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8655930Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8656790Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8657511Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8657853Z ok (6.912s) 2022-11-23T02:25:21.8657983Z 2022-11-23T02:25:21.8658255Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8658592Z Ran 1 test in 6.912s 2022-11-23T02:25:21.8658759Z 2022-11-23T02:25:21.8658912Z OK 2022-11-23T02:25:21.8658998Z 2022-11-23T02:25:21.8659108Z Generating XML reports... 
2022-11-23T02:25:21.8659720Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020828.xml 2022-11-23T02:25:21.8660405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8660868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8661431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8661903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8662137Z 2022-11-23T02:25:21.8662253Z Running tests... 2022-11-23T02:25:21.8662645Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8663180Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8663678Z test_consecutive_comm_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8664638Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38922 2022-11-23T02:25:21.8665048Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38923 2022-11-23T02:25:21.8665684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8666154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8666619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8667096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8667768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8668125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8668689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8669163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8669616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8670079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8670568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8671073Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8671740Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8672416Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8673359Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8674080Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8674948Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8675669Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8676506Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8677357Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8678185Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8678903Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8679741Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically 
target 2022-11-23T02:25:21.8680449Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8681308Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8682101Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8682963Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8683663Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8684529Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8685249Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8685589Z ok (6.913s) 2022-11-23T02:25:21.8685720Z 2022-11-23T02:25:21.8685998Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8686336Z Ran 1 test in 6.913s 2022-11-23T02:25:21.8686504Z 2022-11-23T02:25:21.8686603Z OK 2022-11-23T02:25:21.8686744Z 2022-11-23T02:25:21.8686849Z Generating XML reports... 
2022-11-23T02:25:21.8687417Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020837.xml 2022-11-23T02:25:21.8688100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8688557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8689113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8689597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8689840Z 2022-11-23T02:25:21.8689957Z Running tests... 2022-11-23T02:25:21.8690346Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8690951Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8691388Z test_nested_comm_tensor_wrapping (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8691879Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39133 2022-11-23T02:25:21.8692320Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39134 2022-11-23T02:25:21.8692928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8693388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8693951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8694493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8695172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8695559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8696117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8696591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8697038Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8697520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8697999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8698575Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8699240Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8699911Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8700847Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8701572Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8702441Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8703174Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8704282Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8705107Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8705943Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8706601Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8706912Z ok (6.914s) 2022-11-23T02:25:21.8707061Z 2022-11-23T02:25:21.8707336Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8707667Z Ran 1 test in 6.914s 2022-11-23T02:25:21.8707908Z 
2022-11-23T02:25:21.8707934Z OK 2022-11-23T02:25:21.8708048Z 2022-11-23T02:25:21.8708179Z Generating XML reports... 2022-11-23T02:25:21.8708741Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020847.xml 2022-11-23T02:25:21.8709422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8709857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8710451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8710930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8711169Z 2022-11-23T02:25:21.8711278Z Running tests... 2022-11-23T02:25:21.8711794Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8712292Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8712805Z test_reduce_scatter_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8713272Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39344 2022-11-23T02:25:21.8713732Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39345 2022-11-23T02:25:21.8714346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8714895Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8715370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8715946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8716539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8716976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8717556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8718031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8718486Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8718945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8719437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8719948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8720630Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8721302Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8722338Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8723071Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8723943Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8724737Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8725509Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8726231Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8727096Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8727812Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8728125Z ok (7.029s) 2022-11-23T02:25:21.8728281Z 2022-11-23T02:25:21.8728554Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8728903Z Ran 1 test in 7.030s 2022-11-23T02:25:21.8729146Z 
2022-11-23T02:25:21.8729254Z OK 2022-11-23T02:25:21.8729399Z 2022-11-23T02:25:21.8729534Z Generating XML reports... 2022-11-23T02:25:21.8730104Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020856.xml 2022-11-23T02:25:21.8730761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8731272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8731791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8732276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8732558Z 2022-11-23T02:25:21.8732699Z Running tests... 2022-11-23T02:25:21.8733018Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8733630Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8734103Z test_scatter_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8734581Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39555 2022-11-23T02:25:21.8735143Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39556 2022-11-23T02:25:21.8735670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8736194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8736705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8737168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8737752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8738187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8738776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8739245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8739672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8740152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8740640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.8741142Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.8741780Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.8742485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.8743418Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8744518Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8745368Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:25:21.8746010Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:25:21.8746358Z ok (6.997s) 2022-11-23T02:25:21.8746516Z 2022-11-23T02:25:21.8746884Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8747212Z Ran 1 test in 6.997s 2022-11-23T02:25:21.8747374Z 2022-11-23T02:25:21.8747468Z OK 2022-11-23T02:25:21.8747605Z 2022-11-23T02:25:21.8747730Z Generating XML reports... 
2022-11-23T02:25:21.8748271Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020905.xml 2022-11-23T02:25:21.8748954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8749401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8749970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8750484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8750720Z 2022-11-23T02:25:21.8750831Z Running tests... 2022-11-23T02:25:21.8751242Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8751757Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8752284Z test_accumulate_gradients_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8752795Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39768 2022-11-23T02:25:21.8753247Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39769 2022-11-23T02:25:21.8753838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8754297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8754884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8755370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8755942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8756400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8756979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8757433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8757884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8758370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8758639Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz2if9p_3 2022-11-23T02:25:21.8758915Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz2if9p_3/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8759237Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqmersh0k 2022-11-23T02:25:21.8759436Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqmersh0k/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8759673Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8759914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8760019Z ok (7.404s) 2022-11-23T02:25:21.8760040Z 2022-11-23T02:25:21.8760317Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8760435Z Ran 1 test in 7.404s 2022-11-23T02:25:21.8760454Z 2022-11-23T02:25:21.8760552Z OK 2022-11-23T02:25:21.8760571Z 2022-11-23T02:25:21.8760704Z Generating XML reports... 2022-11-23T02:25:21.8761147Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020915.xml 2022-11-23T02:25:21.8761580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8761773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8762164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8762363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8762382Z 2022-11-23T02:25:21.8762502Z Running tests... 2022-11-23T02:25:21.8762777Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8763097Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8763385Z test_accumulate_gradients_module_with_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8763610Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39989 2022-11-23T02:25:21.8763893Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39990 2022-11-23T02:25:21.8764273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8764455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8764841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8765039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8765413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8765593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8765946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8766150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8766387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8766618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8766880Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcgj_n2js 2022-11-23T02:25:21.8767151Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcgj_n2js/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8767410Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn9dsguyc 2022-11-23T02:25:21.8767683Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn9dsguyc/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8767902Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8768143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8768260Z ok (7.526s) 2022-11-23T02:25:21.8768281Z 2022-11-23T02:25:21.8768560Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8768678Z Ran 1 test in 7.527s 2022-11-23T02:25:21.8768697Z 2022-11-23T02:25:21.8768794Z OK 2022-11-23T02:25:21.8768813Z 2022-11-23T02:25:21.8768945Z Generating XML reports... 2022-11-23T02:25:21.8769415Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020925.xml 2022-11-23T02:25:21.8769789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8769945Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8770326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8770526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8770546Z 2022-11-23T02:25:21.8770731Z Running tests... 2022-11-23T02:25:21.8771011Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8771331Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8771625Z test_arbitrary_forward_return_value (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8771858Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40210 2022-11-23T02:25:21.8772055Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40211 2022-11-23T02:25:21.8772433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8772613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8772999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8773266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8773642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8773826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8774205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8774401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8774612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8774849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8775116Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb1urgtbu 2022-11-23T02:25:21.8775402Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb1urgtbu/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8775661Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpos_f41za 2022-11-23T02:25:21.8775933Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpos_f41za/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8776172Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8776412Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8776496Z ok (7.432s) 2022-11-23T02:25:21.8776540Z 2022-11-23T02:25:21.8776793Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8776911Z Ran 1 test in 7.432s 2022-11-23T02:25:21.8776930Z 2022-11-23T02:25:21.8777028Z OK 2022-11-23T02:25:21.8777047Z 2022-11-23T02:25:21.8777179Z Generating XML reports... 2022-11-23T02:25:21.8777740Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020935.xml 2022-11-23T02:25:21.8778034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8778216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8778599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8778773Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8778793Z 2022-11-23T02:25:21.8778914Z Running tests... 2022-11-23T02:25:21.8779181Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8779495Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8779799Z test_arbitrary_forward_return_value_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8780075Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40431 2022-11-23T02:25:21.8780311Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40432 2022-11-23T02:25:21.8780687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8780843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8781321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8781432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8781800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8781978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8782416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8782614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8782846Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8783078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8783316Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz4a6xk2p 2022-11-23T02:25:21.8783588Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz4a6xk2p/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8784179Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_aoqhu7e 2022-11-23T02:25:21.8784591Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_aoqhu7e/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8784830Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8785085Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8785194Z ok (7.398s) 2022-11-23T02:25:21.8785216Z 2022-11-23T02:25:21.8785508Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8785605Z Ran 1 test in 7.398s 2022-11-23T02:25:21.8785628Z 2022-11-23T02:25:21.8785714Z OK 2022-11-23T02:25:21.8785740Z 2022-11-23T02:25:21.8785871Z Generating XML reports... 2022-11-23T02:25:21.8786257Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020945.xml 2022-11-23T02:25:21.8786638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8786821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8787199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8787402Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8787423Z 2022-11-23T02:25:21.8787541Z Running tests... 2022-11-23T02:25:21.8787786Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8788100Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8788391Z test_bf16_compress_wrapper_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8788612Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40652 2022-11-23T02:25:21.8788835Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40653 2022-11-23T02:25:21.8789212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8789392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8789855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8790038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8790409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8790592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8790966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8791252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8791450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8791948Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.8792244Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8792796Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.8793060Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxi2a8k5q 2022-11-23T02:25:21.8793337Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxi2a8k5q/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8793570Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsey9jxa0 2022-11-23T02:25:21.8793848Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsey9jxa0/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8793954Z ok (6.900s) 2022-11-23T02:25:21.8793975Z 2022-11-23T02:25:21.8794254Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8794372Z Ran 1 test in 6.900s 2022-11-23T02:25:21.8794391Z 2022-11-23T02:25:21.8794491Z OK 2022-11-23T02:25:21.8794510Z 2022-11-23T02:25:21.8794643Z Generating XML reports... 2022-11-23T02:25:21.8795109Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020955.xml 2022-11-23T02:25:21.8795460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8795643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8796025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8796229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8796249Z 2022-11-23T02:25:21.8796364Z Running tests... 
2022-11-23T02:25:21.8796643Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8796950Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8797235Z test_bf16_compress_wrapper_nccl (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8797462Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40873 2022-11-23T02:25:21.8797660Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40874 2022-11-23T02:25:21.8798034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8798221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8798654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8798862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8799232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8799416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8799792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8799962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8800207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8800765Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; 
batch_tensors_with_same_shape = False 2022-11-23T02:25:21.8801056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8801608Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.8801876Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo1auaxxh 2022-11-23T02:25:21.8802158Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo1auaxxh/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8802428Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpanwcbj0_ 2022-11-23T02:25:21.8802701Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpanwcbj0_/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8802811Z ok (6.923s) 2022-11-23T02:25:21.8802830Z 2022-11-23T02:25:21.8803104Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8803197Z Ran 1 test in 6.923s 2022-11-23T02:25:21.8803216Z 2022-11-23T02:25:21.8803314Z OK 2022-11-23T02:25:21.8803333Z 2022-11-23T02:25:21.8803465Z Generating XML reports... 
2022-11-23T02:25:21.8803939Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021004.xml 2022-11-23T02:25:21.8804316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8804500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8804895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8805091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8805112Z 2022-11-23T02:25:21.8805227Z Running tests... 2022-11-23T02:25:21.8805472Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8805789Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8806074Z test_builtin_ddp_comm_hooks_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8806298Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41094 2022-11-23T02:25:21.8806516Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41095 2022-11-23T02:25:21.8806892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8807076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8807506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8807686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8808057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8808238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8808617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8808812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8809047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8809275Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8809600Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpflb5qqp2 2022-11-23T02:25:21.8809855Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpflb5qqp2/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8810117Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6wem4adr 2022-11-23T02:25:21.8810390Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6wem4adr/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8810499Z ok 
(6.908s) 2022-11-23T02:25:21.8810519Z 2022-11-23T02:25:21.8810796Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8810916Z Ran 1 test in 6.908s 2022-11-23T02:25:21.8810936Z 2022-11-23T02:25:21.8811037Z OK 2022-11-23T02:25:21.8811056Z 2022-11-23T02:25:21.8811189Z Generating XML reports... 2022-11-23T02:25:21.8811652Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021013.xml 2022-11-23T02:25:21.8812041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8812198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8812585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8812783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8812803Z 2022-11-23T02:25:21.8812919Z Running tests... 2022-11-23T02:25:21.8813189Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8813504Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8813804Z test_builtin_ddp_comm_hooks_nccl_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8814006Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41315 2022-11-23T02:25:21.8814236Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41316 2022-11-23T02:25:21.8814614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8814795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8815231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8815373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8815741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8815921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8816297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8816523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8816763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8817002Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8817266Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3wi5aamq 2022-11-23T02:25:21.8817543Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3wi5aamq/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8817804Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2yzatx6h 2022-11-23T02:25:21.8818081Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2yzatx6h/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8818190Z ok 
(6.895s) 2022-11-23T02:25:21.8818210Z 2022-11-23T02:25:21.8818484Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8818624Z Ran 1 test in 6.895s 2022-11-23T02:25:21.8818646Z 2022-11-23T02:25:21.8818751Z OK 2022-11-23T02:25:21.8818770Z 2022-11-23T02:25:21.8818901Z Generating XML reports... 2022-11-23T02:25:21.8819368Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021023.xml 2022-11-23T02:25:21.8819747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8819937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8820321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8820516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8820536Z 2022-11-23T02:25:21.8820626Z Running tests... 2022-11-23T02:25:21.8820892Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8821217Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8821496Z test_channels_last_contig (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8821817Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41536 2022-11-23T02:25:21.8822053Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41537 2022-11-23T02:25:21.8822438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8822622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8822981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8823179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8823553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8823734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8824390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8824683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8824909Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8825143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8825259Z ok (6.779s) 2022-11-23T02:25:21.8825259Z 2022-11-23T02:25:21.8825526Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8825662Z Ran 1 test in 6.779s 2022-11-23T02:25:21.8825683Z 2022-11-23T02:25:21.8825774Z OK 2022-11-23T02:25:21.8825801Z 2022-11-23T02:25:21.8825928Z Generating XML reports... 
2022-11-23T02:25:21.8826382Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021032.xml 2022-11-23T02:25:21.8826771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8826953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8827340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8827513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8827560Z 2022-11-23T02:25:21.8827651Z Running tests... 2022-11-23T02:25:21.8827921Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8828237Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8828538Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8828919Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8829147Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41749 2022-11-23T02:25:21.8829371Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41750 2022-11-23T02:25:21.8829747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8829904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8830287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8830488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8830857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8831039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8831414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8831613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8831848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8832050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8832311Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplfy8mz08 2022-11-23T02:25:21.8832589Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplfy8mz08/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8832843Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_vu5yjri 2022-11-23T02:25:21.8833121Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_vu5yjri/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8833235Z ok 
(6.219s) 2022-11-23T02:25:21.8833254Z 2022-11-23T02:25:21.8833531Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8833648Z Ran 1 test in 6.220s 2022-11-23T02:25:21.8833668Z 2022-11-23T02:25:21.8833765Z OK 2022-11-23T02:25:21.8833784Z 2022-11-23T02:25:21.8833888Z Generating XML reports... 2022-11-23T02:25:21.8834354Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021042.xml 2022-11-23T02:25:21.8834726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8834906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8835285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8835484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8835552Z 2022-11-23T02:25:21.8835674Z Running tests... 2022-11-23T02:25:21.8835947Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8836242Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8836491Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8836764Z Dynamic module can be checkpointed multiple times with weight sharing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8836988Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41969 2022-11-23T02:25:21.8837208Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41970 2022-11-23T02:25:21.8837580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8837826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8838217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8838416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8838760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8838941Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8839321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8839510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8839746Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8839981Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8840249Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5l4hu7az 2022-11-23T02:25:21.8840528Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5l4hu7az/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8840856Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvfkeomev 2022-11-23T02:25:21.8841039Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvfkeomev/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8841146Z ok 
(6.322s) 2022-11-23T02:25:21.8841165Z 2022-11-23T02:25:21.8841437Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8841552Z Ran 1 test in 6.322s 2022-11-23T02:25:21.8841571Z 2022-11-23T02:25:21.8841666Z OK 2022-11-23T02:25:21.8841685Z 2022-11-23T02:25:21.8841815Z Generating XML reports... 2022-11-23T02:25:21.8842281Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021050.xml 2022-11-23T02:25:21.8842661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8842821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8843205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8843404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8843424Z 2022-11-23T02:25:21.8843539Z Running tests... 2022-11-23T02:25:21.8843906Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8844132Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8844379Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8844636Z DDP works as expected when layer is checkpointed only once. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8844891Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42189 2022-11-23T02:25:21.8845118Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42190 2022-11-23T02:25:21.8845502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8845686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8846070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8846267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8846637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8846813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8847248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8847418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8847658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8847897Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8848161Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcgffpb8h 2022-11-23T02:25:21.8848441Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcgffpb8h/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8848700Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnrcm6645 2022-11-23T02:25:21.8848974Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnrcm6645/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8849275Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8849440Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8849678Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8849914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8850829Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:25:21.8850951Z warnings.warn( 2022-11-23T02:25:21.8851870Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:25:21.8851993Z warnings.warn( 2022-11-23T02:25:21.8852232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8852467Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8852701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8852911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8853149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:25:21.8853380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8853616Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8853896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8854010Z ok (6.373s) 2022-11-23T02:25:21.8854030Z 2022-11-23T02:25:21.8854308Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8854429Z Ran 1 test in 6.373s 2022-11-23T02:25:21.8854448Z 2022-11-23T02:25:21.8854546Z OK 2022-11-23T02:25:21.8854565Z 2022-11-23T02:25:21.8854671Z Generating XML reports... 2022-11-23T02:25:21.8855144Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021059.xml 2022-11-23T02:25:21.8855522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8855703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8856122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8856350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8856370Z 2022-11-23T02:25:21.8856484Z Running tests... 2022-11-23T02:25:21.8856760Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8857055Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8857303Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8857561Z DDP works as expected when layer is checkpointed only once. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8857782Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42409 2022-11-23T02:25:21.8858004Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42410 2022-11-23T02:25:21.8858382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8858571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8858955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8859151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8859579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8859672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8860050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8860242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8860476Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8860716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8860984Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl8kybpzj 2022-11-23T02:25:21.8861259Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl8kybpzj/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8861494Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphouq8qi8 2022-11-23T02:25:21.8861768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphouq8qi8/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8862009Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8862249Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8862483Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8862721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8863685Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:25:21.8863815Z warnings.warn( 2022-11-23T02:25:21.8865210Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:25:21.8865331Z warnings.warn( 2022-11-23T02:25:21.8865617Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8865863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8866030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8866265Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8866495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:25:21.8866724Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8866958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8867193Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8867298Z ok (6.403s) 2022-11-23T02:25:21.8867317Z 2022-11-23T02:25:21.8867575Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8867699Z Ran 1 test in 6.403s 2022-11-23T02:25:21.8867718Z 2022-11-23T02:25:21.8867818Z OK 2022-11-23T02:25:21.8867837Z 2022-11-23T02:25:21.8867968Z Generating XML reports... 2022-11-23T02:25:21.8868439Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021108.xml 2022-11-23T02:25:21.8868820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8869004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8869384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8869559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8869603Z 2022-11-23T02:25:21.8869693Z Running tests... 2022-11-23T02:25:21.8869968Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8870287Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8870563Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8870918Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8871142Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42629 2022-11-23T02:25:21.8871360Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42630 2022-11-23T02:25:21.8871734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8871892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8872276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8872536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8872922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8873100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8873475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8873673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8873913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8874123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8874390Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp18776dbv 2022-11-23T02:25:21.8874666Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp18776dbv/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8874982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoorc9je3 2022-11-23T02:25:21.8875257Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoorc9je3/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8875499Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8875742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8875979Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8876218Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8876303Z ok (6.290s) 2022-11-23T02:25:21.8876322Z 2022-11-23T02:25:21.8876597Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8876713Z Ran 1 test in 6.291s 2022-11-23T02:25:21.8876735Z 2022-11-23T02:25:21.8876834Z OK 2022-11-23T02:25:21.8876853Z 2022-11-23T02:25:21.8876984Z Generating XML reports... 2022-11-23T02:25:21.8877453Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021117.xml 2022-11-23T02:25:21.8877830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8878077Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8878372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8878568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8878587Z 2022-11-23T02:25:21.8878700Z Running tests... 2022-11-23T02:25:21.8878970Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8879286Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8879561Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8879919Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8880143Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42849 2022-11-23T02:25:21.8880366Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42850 2022-11-23T02:25:21.8880720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8880904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8881284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8881484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8881907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8882095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8882480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8882678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8882890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8883131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8883401Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq4lgvw_z 2022-11-23T02:25:21.8883683Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq4lgvw_z/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8884044Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpakkupuyw 2022-11-23T02:25:21.8884325Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpakkupuyw/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8884572Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8884810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8885046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8885253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8885361Z ok (6.293s) 2022-11-23T02:25:21.8885382Z 2022-11-23T02:25:21.8885662Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8885780Z Ran 1 test in 6.293s 2022-11-23T02:25:21.8885799Z 2022-11-23T02:25:21.8885899Z OK 2022-11-23T02:25:21.8885918Z 2022-11-23T02:25:21.8886055Z Generating XML reports... 2022-11-23T02:25:21.8886529Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021126.xml 2022-11-23T02:25:21.8886910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8887066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8887452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8887653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8887673Z 2022-11-23T02:25:21.8887789Z Running tests... 2022-11-23T02:25:21.8888058Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8888371Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8888600Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8891330Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8891612Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43069 2022-11-23T02:25:21.8891847Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43070 2022-11-23T02:25:21.8892144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8892323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8892704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8892896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8893258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8893474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8893863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8894104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8894293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8894517Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8894779Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpujbo7537 2022-11-23T02:25:21.8895050Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpujbo7537/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8895307Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ucrxru5 2022-11-23T02:25:21.8895633Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ucrxru5/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8895855Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8896094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8896971Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:25:21.8897662Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:25:21.8897915Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8898150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8898254Z ok (6.429s) 2022-11-23T02:25:21.8898274Z 2022-11-23T02:25:21.8898550Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8898667Z Ran 1 test in 6.430s 2022-11-23T02:25:21.8898686Z 2022-11-23T02:25:21.8898782Z OK 2022-11-23T02:25:21.8898800Z 2022-11-23T02:25:21.8898926Z Generating XML reports... 
2022-11-23T02:25:21.8899399Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021134.xml 2022-11-23T02:25:21.8899752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8899932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8900314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8900507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8900527Z 2022-11-23T02:25:21.8900641Z Running tests... 2022-11-23T02:25:21.8900904Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8901216Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8901458Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8901891Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8902099Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43289 2022-11-23T02:25:21.8902322Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43290 2022-11-23T02:25:21.8902699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8902877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8903255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8903445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8903810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8904294Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8904759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8904954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8905181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8905406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8905671Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5dv1o4kg 2022-11-23T02:25:21.8905927Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5dv1o4kg/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8906110Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppuz__42a 2022-11-23T02:25:21.8906378Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppuz__42a/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8906469Z ok 
(6.309s) 2022-11-23T02:25:21.8906514Z 2022-11-23T02:25:21.8906761Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8906875Z Ran 1 test in 6.310s 2022-11-23T02:25:21.8906894Z 2022-11-23T02:25:21.8906997Z OK 2022-11-23T02:25:21.8907016Z 2022-11-23T02:25:21.8907149Z Generating XML reports... 2022-11-23T02:25:21.8907614Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021143.xml 2022-11-23T02:25:21.8907985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8908162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8908539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8908716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8908756Z 2022-11-23T02:25:21.8908850Z Running tests... 2022-11-23T02:25:21.8909197Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8909430Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8909673Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8909945Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8910169Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43509 2022-11-23T02:25:21.8910389Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43510 2022-11-23T02:25:21.8910739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8910919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8911373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8911584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8911957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8912137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8912509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8912704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8912936Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8913145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8913473Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp3vicrzr 2022-11-23T02:25:21.8913750Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp3vicrzr/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8914076Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplp5zayv3 2022-11-23T02:25:21.8914276Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplp5zayv3/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8914512Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8914757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8914994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8915245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8915315Z ok (6.193s) 2022-11-23T02:25:21.8915338Z 2022-11-23T02:25:21.8915622Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8915737Z Ran 1 test in 6.193s 2022-11-23T02:25:21.8915756Z 2022-11-23T02:25:21.8915852Z OK 2022-11-23T02:25:21.8915870Z 2022-11-23T02:25:21.8915996Z Generating XML reports... 2022-11-23T02:25:21.8916459Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021152.xml 2022-11-23T02:25:21.8916828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8917006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8917366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8917561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8917581Z 2022-11-23T02:25:21.8917697Z Running tests... 2022-11-23T02:25:21.8917968Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8918323Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8918546Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8918820Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8919041Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43729 2022-11-23T02:25:21.8919240Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43730 2022-11-23T02:25:21.8919612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8919794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8920172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8920440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8920817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8920994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8921377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8921567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8921869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8922108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8922367Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6gpvv4tx 2022-11-23T02:25:21.8922697Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6gpvv4tx/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8922954Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd6228a2g 2022-11-23T02:25:21.8923222Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd6228a2g/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8924007Z [W 
reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:25:21.8924787Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:25:21.8925712Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:25:21.8925828Z warnings.warn( 2022-11-23T02:25:21.8926745Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:25:21.8926863Z warnings.warn( 2022-11-23T02:25:21.8927105Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8927345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8927559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8927794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8927897Z ok (6.438s) 2022-11-23T02:25:21.8927917Z 2022-11-23T02:25:21.8928187Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8928305Z Ran 1 test in 6.438s 2022-11-23T02:25:21.8928325Z 2022-11-23T02:25:21.8928467Z OK 2022-11-23T02:25:21.8928488Z 2022-11-23T02:25:21.8928620Z Generating XML reports... 2022-11-23T02:25:21.8929084Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021201.xml 2022-11-23T02:25:21.8929434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8929613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8929994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8930190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8930210Z 2022-11-23T02:25:21.8930324Z Running tests... 
2022-11-23T02:25:21.8930592Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8930977Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8931230Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8931507Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8931707Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43949 2022-11-23T02:25:21.8931928Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43950 2022-11-23T02:25:21.8932300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8932478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8932856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8933053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8933421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8933597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8933950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8934141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8934373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8934604Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8934863Z INFO:torch.distributed.nn.jit.instantiator:Created a 
temporary directory at /tmp/tmpy_vhjoyf 2022-11-23T02:25:21.8935185Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy_vhjoyf/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8935486Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphxt0r_va 2022-11-23T02:25:21.8935743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphxt0r_va/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8936587Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:25:21.8936702Z warnings.warn( 2022-11-23T02:25:21.8937637Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:25:21.8937766Z warnings.warn( 2022-11-23T02:25:21.8938007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8938242Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8938476Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8938710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:25:21.8938814Z ok (6.305s) 2022-11-23T02:25:21.8938833Z 2022-11-23T02:25:21.8939101Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8939217Z Ran 1 test in 6.305s 2022-11-23T02:25:21.8939236Z 2022-11-23T02:25:21.8939309Z OK 2022-11-23T02:25:21.8939327Z 2022-11-23T02:25:21.8939454Z Generating XML reports... 2022-11-23T02:25:21.8939980Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021209.xml 2022-11-23T02:25:21.8940355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8940534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8940917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8941111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8941130Z 2022-11-23T02:25:21.8941243Z Running tests... 2022-11-23T02:25:21.8941511Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8941805Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8942069Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8942318Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8942541Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44169 2022-11-23T02:25:21.8942761Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44170 2022-11-23T02:25:21.8943132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8943309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8943688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8944124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8944606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8944771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8945162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8945351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8945584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8945811Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8945988Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdmxwg8wo 2022-11-23T02:25:21.8946298Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjd7lv5el 2022-11-23T02:25:21.8946578Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdmxwg8wo/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8946851Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjd7lv5el/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8947173Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8947421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8947656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8947890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8947995Z ok (6.323s) 2022-11-23T02:25:21.8948015Z 2022-11-23T02:25:21.8948292Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8948385Z Ran 1 test in 6.323s 2022-11-23T02:25:21.8948405Z 2022-11-23T02:25:21.8948499Z OK 2022-11-23T02:25:21.8948518Z 2022-11-23T02:25:21.8948644Z Generating XML reports... 2022-11-23T02:25:21.8949111Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021218.xml 2022-11-23T02:25:21.8949554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8949737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8950116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8950309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8950328Z 2022-11-23T02:25:21.8950438Z Running tests... 2022-11-23T02:25:21.8950680Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8950991Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8951251Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8951491Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8951722Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44389 2022-11-23T02:25:21.8951942Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44390 2022-11-23T02:25:21.8952313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8952493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8952855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8953048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8953414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8953590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8953964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8954163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8954395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8954625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8954862Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9xjr8wh2 2022-11-23T02:25:21.8955137Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9xjr8wh2/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8955393Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn62ybthq 2022-11-23T02:25:21.8955662Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn62ybthq/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8955899Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8956188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8956522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8956660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8956894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8957103Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8957334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8957564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.8957667Z ok (6.298s) 2022-11-23T02:25:21.8957686Z 2022-11-23T02:25:21.8957957Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8958121Z Ran 1 test in 6.298s 2022-11-23T02:25:21.8958141Z 2022-11-23T02:25:21.8958238Z OK 2022-11-23T02:25:21.8958257Z 2022-11-23T02:25:21.8958384Z Generating XML reports... 2022-11-23T02:25:21.8958828Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021227.xml 2022-11-23T02:25:21.8959198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8959375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8959751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8959964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8959965Z 2022-11-23T02:25:21.8960075Z Running tests... 
2022-11-23T02:25:21.8960337Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8960657Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8960949Z test_ddp_comm_hook_allreduce_hook_nccl (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8961150Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44609 2022-11-23T02:25:21.8961373Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44610 2022-11-23T02:25:21.8961744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8961924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8962301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8962498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8962862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8963041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8963394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8963587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8963822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8964053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8964310Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ebc3922 2022-11-23T02:25:21.8964579Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp7ebc3922/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8964834Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpulvhz2nw 2022-11-23T02:25:21.8965161Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpulvhz2nw/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8965274Z ok (6.897s) 2022-11-23T02:25:21.8965294Z 2022-11-23T02:25:21.8965546Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8965664Z Ran 1 test in 6.897s 2022-11-23T02:25:21.8965683Z 2022-11-23T02:25:21.8965778Z OK 2022-11-23T02:25:21.8965798Z 2022-11-23T02:25:21.8965923Z Generating XML reports... 2022-11-23T02:25:21.8966387Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021236.xml 2022-11-23T02:25:21.8966761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8966941Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8967323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8967567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8967608Z 2022-11-23T02:25:21.8967698Z Running tests... 2022-11-23T02:25:21.8967965Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8968281Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8968589Z test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8968812Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44830 2022-11-23T02:25:21.8969030Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44831 2022-11-23T02:25:21.8969400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8969576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8969940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8970133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8970494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8970671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8971048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8971238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8971473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8971706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8971949Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps6rhdk4v 2022-11-23T02:25:21.8972227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps6rhdk4v/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8972485Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvhyl7nnn 2022-11-23T02:25:21.8972758Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvhyl7nnn/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8972864Z ok 
(6.899s) 2022-11-23T02:25:21.8972885Z 2022-11-23T02:25:21.8973155Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8973270Z Ran 1 test in 6.900s 2022-11-23T02:25:21.8973290Z 2022-11-23T02:25:21.8973386Z OK 2022-11-23T02:25:21.8973404Z 2022-11-23T02:25:21.8973530Z Generating XML reports... 2022-11-23T02:25:21.8973973Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021245.xml 2022-11-23T02:25:21.8974398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8974585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8974968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8975162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8975181Z 2022-11-23T02:25:21.8975292Z Running tests... 2022-11-23T02:25:21.8975557Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8975868Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8976156Z test_ddp_comm_hook_allreduce_hook_nccl_static_graph (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8976379Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45051 2022-11-23T02:25:21.8976649Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45052 2022-11-23T02:25:21.8977021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8977202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8977577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8977768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8978130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8978305Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8978660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8978853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8979089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8979314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8979577Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4tlug00t 2022-11-23T02:25:21.8979847Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4tlug00t/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8980102Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjuse0t11 2022-11-23T02:25:21.8980371Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjuse0t11/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8980453Z ok 
(6.887s) 2022-11-23T02:25:21.8980494Z 2022-11-23T02:25:21.8980740Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8980858Z Ran 1 test in 6.887s 2022-11-23T02:25:21.8980877Z 2022-11-23T02:25:21.8980976Z OK 2022-11-23T02:25:21.8980998Z 2022-11-23T02:25:21.8981124Z Generating XML reports... 2022-11-23T02:25:21.8981588Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021255.xml 2022-11-23T02:25:21.8981956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8982131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8982513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8982685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8982704Z 2022-11-23T02:25:21.8982816Z Running tests... 2022-11-23T02:25:21.8983082Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8983398Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8983687Z test_ddp_comm_hook_allreduce_with_then_hook_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8984363Z This unit test verifies whether a DDP communication hook that calls allreduce and then ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8984643Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45272 2022-11-23T02:25:21.8984911Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45273 2022-11-23T02:25:21.8985260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8985446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8985832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8986024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8986392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8986570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8986943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8987134Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8987364Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8987574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8987831Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpulr9pzhc 2022-11-23T02:25:21.8988106Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpulr9pzhc/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8988368Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1tcatyl_ 2022-11-23T02:25:21.8988638Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1tcatyl_/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8988742Z ok 
(6.934s) 2022-11-23T02:25:21.8988761Z 2022-11-23T02:25:21.8989033Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8989148Z Ran 1 test in 6.934s 2022-11-23T02:25:21.8989167Z 2022-11-23T02:25:21.8989240Z OK 2022-11-23T02:25:21.8989279Z 2022-11-23T02:25:21.8989455Z Generating XML reports... 2022-11-23T02:25:21.8989845Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021304.xml 2022-11-23T02:25:21.8990219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8990398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8990782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8990973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8990993Z 2022-11-23T02:25:21.8991104Z Running tests... 2022-11-23T02:25:21.8991368Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8991657Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8991975Z test_ddp_comm_hook_future_passing_gpu_nccl (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.8992209Z This unit test verifies whether the Future object is passed properly using nccl backend. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.8992406Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45493 2022-11-23T02:25:21.8992625Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45494 2022-11-23T02:25:21.8993059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8993248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8993633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8993803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8994171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8994346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8994721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8994910Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8995142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.8995422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.8995683Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpef4h5_6n 2022-11-23T02:25:21.8995953Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpef4h5_6n/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8996187Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbqcvow_o 2022-11-23T02:25:21.8996461Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbqcvow_o/_remote_module_non_scriptable.py 2022-11-23T02:25:21.8996565Z ok 
(6.900s) 2022-11-23T02:25:21.8996584Z 2022-11-23T02:25:21.8996853Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8996969Z Ran 1 test in 6.901s 2022-11-23T02:25:21.8996988Z 2022-11-23T02:25:21.8997085Z OK 2022-11-23T02:25:21.8997103Z 2022-11-23T02:25:21.8997304Z Generating XML reports... 2022-11-23T02:25:21.8997696Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021313.xml 2022-11-23T02:25:21.8998046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.8998226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.8998608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.8998802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.8998822Z 2022-11-23T02:25:21.8998933Z Running tests... 2022-11-23T02:25:21.8999199Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.8999507Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.8999797Z test_ddp_multi_device_module_config (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9000025Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45714 2022-11-23T02:25:21.9000225Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45715 2022-11-23T02:25:21.9000597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9000774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9001153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9001348Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9001711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9001886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9002313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9002490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9002724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9002957Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9003062Z ok (6.907s) 2022-11-23T02:25:21.9003081Z 2022-11-23T02:25:21.9003351Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9003467Z Ran 1 test in 6.907s 2022-11-23T02:25:21.9003486Z 2022-11-23T02:25:21.9003578Z OK 2022-11-23T02:25:21.9003597Z 2022-11-23T02:25:21.9003724Z Generating XML reports... 
2022-11-23T02:25:21.9004179Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021323.xml 2022-11-23T02:25:21.9004581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9004759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9005136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9005329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9005348Z 2022-11-23T02:25:21.9005460Z Running tests... 2022-11-23T02:25:21.9005721Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9006029Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9006237Z test_ddp_packed_sequence (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.9006476Z Tests that DDP with ``device_ids`` specified can run a forward and ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9006704Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45921 2022-11-23T02:25:21.9006927Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45922 2022-11-23T02:25:21.9007297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9007477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9007856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9008049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9008418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9008596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9008950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9009148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9009381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9009630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9009862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9010106Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9010506Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9010904Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9011148Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe5z1ow4y 2022-11-23T02:25:21.9011471Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe5z1ow4y/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9011740Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeshtc40w 2022-11-23T02:25:21.9012014Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeshtc40w/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9012968Z /opt/conda/lib/python3.10/site-packages/torch/distributed/_shard/replicated_tensor.py:113: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters(). (Triggered internally at /var/lib/jenkins/workspace/aten/src/ATen/native/cudnn/RNN.cpp:982.) 2022-11-23T02:25:21.9013094Z rs = func(*args, **kwargs) 2022-11-23T02:25:21.9014102Z /opt/conda/lib/python3.10/site-packages/torch/distributed/_shard/replicated_tensor.py:113: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters(). (Triggered internally at /var/lib/jenkins/workspace/aten/src/ATen/native/cudnn/RNN.cpp:982.) 
2022-11-23T02:25:21.9014216Z rs = func(*args, **kwargs) 2022-11-23T02:25:21.9014321Z ok (7.822s) 2022-11-23T02:25:21.9014340Z 2022-11-23T02:25:21.9014608Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9014721Z Ran 1 test in 7.822s 2022-11-23T02:25:21.9014741Z 2022-11-23T02:25:21.9014813Z OK 2022-11-23T02:25:21.9014832Z 2022-11-23T02:25:21.9014961Z Generating XML reports... 2022-11-23T02:25:21.9015420Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021332.xml 2022-11-23T02:25:21.9015788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9015974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9016356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9016552Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9016571Z 2022-11-23T02:25:21.9016684Z Running tests... 2022-11-23T02:25:21.9016949Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9017244Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9017519Z test_ddp_weight_sharing (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9017744Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46142 2022-11-23T02:25:21.9017964Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46143 2022-11-23T02:25:21.9018341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9018521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9018903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9019099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9019445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9019618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9019994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9020184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9020464Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9020703Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9020961Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsaosaq6m 2022-11-23T02:25:21.9021233Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsaosaq6m/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9021494Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxxywj5aa 2022-11-23T02:25:21.9021744Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxxywj5aa/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9022066Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9022309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9022595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9022826Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9023060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9023297Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9023529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9023735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9023933Z ok (7.187s) 2022-11-23T02:25:21.9024130Z 2022-11-23T02:25:21.9024503Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9024635Z Ran 1 test in 7.187s 2022-11-23T02:25:21.9024635Z 2022-11-23T02:25:21.9024744Z OK 2022-11-23T02:25:21.9024744Z 2022-11-23T02:25:21.9024904Z Generating XML reports... 2022-11-23T02:25:21.9025379Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021342.xml 2022-11-23T02:25:21.9025753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9025911Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9026198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9026391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9026411Z 2022-11-23T02:25:21.9026524Z Running tests... 
2022-11-23T02:25:21.9026789Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9027104Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9027386Z test_ddp_with_lazy_parameters (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9027613Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46363 2022-11-23T02:25:21.9027833Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46364 2022-11-23T02:25:21.9028186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9028365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9028748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9028942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9029302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9029471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9029953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9030153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9030364Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9030900Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/lazy.py:180: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment. 
2022-11-23T02:25:21.9031171Z warnings.warn('Lazy modules are a new feature under heavy development ' 2022-11-23T02:25:21.9031426Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxibsdj0w 2022-11-23T02:25:21.9031695Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxibsdj0w/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9031921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9032543Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/lazy.py:180: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment. 2022-11-23T02:25:21.9032816Z warnings.warn('Lazy modules are a new feature under heavy development ' 2022-11-23T02:25:21.9033068Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaur5yee6 2022-11-23T02:25:21.9033339Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaur5yee6/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9033425Z ok (4.095s) 2022-11-23T02:25:21.9033445Z 2022-11-23T02:25:21.9033712Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9033825Z Ran 1 test in 4.096s 2022-11-23T02:25:21.9033844Z 2022-11-23T02:25:21.9033939Z OK 2022-11-23T02:25:21.9033958Z 2022-11-23T02:25:21.9034083Z Generating XML reports... 
2022-11-23T02:25:21.9034556Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021352.xml 2022-11-23T02:25:21.9034927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9035109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9035467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9035661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9035681Z 2022-11-23T02:25:21.9035789Z Running tests... 2022-11-23T02:25:21.9036072Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9036364Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9036647Z test_default_ddp_comm_hooks_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9036879Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46566 2022-11-23T02:25:21.9037096Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46567 2022-11-23T02:25:21.9037467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9037626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9037992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9038168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9038544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9038739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9039113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9039357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9039595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9039804Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9040062Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0c7iabkx 2022-11-23T02:25:21.9040332Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0c7iabkx/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9040585Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmjio6eaf 2022-11-23T02:25:21.9040854Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmjio6eaf/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9040959Z ok 
(6.921s) 2022-11-23T02:25:21.9041023Z 2022-11-23T02:25:21.9041301Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9041419Z Ran 1 test in 6.921s 2022-11-23T02:25:21.9041438Z 2022-11-23T02:25:21.9041533Z OK 2022-11-23T02:25:21.9041552Z 2022-11-23T02:25:21.9041659Z Generating XML reports... 2022-11-23T02:25:21.9042127Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021359.xml 2022-11-23T02:25:21.9042499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9042679Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9043059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9043253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9043272Z 2022-11-23T02:25:21.9043383Z Running tests... 2022-11-23T02:25:21.9043653Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9043946Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9044336Z test_default_ddp_comm_hooks_nccl_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9044470Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46787 2022-11-23T02:25:21.9044690Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46788 2022-11-23T02:25:21.9045063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9045240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9045618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9045812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9046189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9046346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9046721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9046911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9047146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9047378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9047708Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5v85n272 2022-11-23T02:25:21.9047996Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5v85n272/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9048223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr9gch4zm 2022-11-23T02:25:21.9048480Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr9gch4zm/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9048589Z ok 
(7.018s) 2022-11-23T02:25:21.9048608Z 2022-11-23T02:25:21.9048878Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9048995Z Ran 1 test in 7.019s 2022-11-23T02:25:21.9049015Z 2022-11-23T02:25:21.9049111Z OK 2022-11-23T02:25:21.9049130Z 2022-11-23T02:25:21.9049258Z Generating XML reports... 2022-11-23T02:25:21.9049720Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021408.xml 2022-11-23T02:25:21.9050092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9050269Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9050687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9050879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9050898Z 2022-11-23T02:25:21.9051009Z Running tests... 2022-11-23T02:25:21.9051274Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9051590Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9051861Z test_failure_recovery (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9052085Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47008 2022-11-23T02:25:21.9052305Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47009 2022-11-23T02:25:21.9052657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9052839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9053222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9053419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9053788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9053965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9054338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9054533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9054766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9054975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9055247Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3mz93y10 2022-11-23T02:25:21.9055516Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3mz93y10/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9055771Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2lgwaygv 2022-11-23T02:25:21.9056039Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2lgwaygv/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9056274Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9056511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9056775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9056994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9057067Z ok (7.510s) 2022-11-23T02:25:21.9057086Z 2022-11-23T02:25:21.9057410Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9057531Z Ran 1 test in 7.511s 2022-11-23T02:25:21.9057550Z 2022-11-23T02:25:21.9057646Z OK 2022-11-23T02:25:21.9057666Z 2022-11-23T02:25:21.9057792Z Generating XML reports... 2022-11-23T02:25:21.9058258Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021417.xml 2022-11-23T02:25:21.9058625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9058900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9059223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9059418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9059527Z 2022-11-23T02:25:21.9059638Z Running tests... 2022-11-23T02:25:21.9059826Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9060141Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9060519Z test_find_unused_parameters_kwarg_debug_detail (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9061216Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82632 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.674s) 2022-11-23T02:25:21.9061237Z 2022-11-23T02:25:21.9061506Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9061713Z Ran 1 test in 1.675s 2022-11-23T02:25:21.9061713Z 2022-11-23T02:25:21.9061813Z OK (skipped=1) 2022-11-23T02:25:21.9061855Z 2022-11-23T02:25:21.9061882Z Generating XML reports... 2022-11-23T02:25:21.9062338Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021427.xml 2022-11-23T02:25:21.9062711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9062888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9063269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9063462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9063482Z 2022-11-23T02:25:21.9063592Z Running tests... 2022-11-23T02:25:21.9064161Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9064461Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9064794Z test_find_unused_parameters_kwarg_debug_info (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9065550Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/83301 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.679s) 2022-11-23T02:25:21.9065550Z 2022-11-23T02:25:21.9065847Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9065900Z Ran 1 test in 1.679s 2022-11-23T02:25:21.9065900Z 2022-11-23T02:25:21.9065993Z OK (skipped=1) 2022-11-23T02:25:21.9066013Z 2022-11-23T02:25:21.9066139Z Generating XML reports... 2022-11-23T02:25:21.9066597Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021432.xml 2022-11-23T02:25:21.9067043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9067232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9067593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9067786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9067805Z 2022-11-23T02:25:21.9067915Z Running tests... 2022-11-23T02:25:21.9068258Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9068554Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9068790Z test_find_unused_parameters_kwarg_debug_off (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9069595Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82385 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.672s) 2022-11-23T02:25:21.9069682Z 2022-11-23T02:25:21.9069895Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9070012Z Ran 1 test in 1.673s 2022-11-23T02:25:21.9070032Z 2022-11-23T02:25:21.9070119Z OK (skipped=1) 2022-11-23T02:25:21.9070162Z 2022-11-23T02:25:21.9070270Z Generating XML reports... 2022-11-23T02:25:21.9070799Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021436.xml 2022-11-23T02:25:21.9071135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9071371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9071714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9071941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9071941Z 2022-11-23T02:25:21.9072075Z Running tests... 2022-11-23T02:25:21.9072269Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9072560Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9072880Z test_find_unused_parameters_kwarg_grad_is_view_debug_detail (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9073625Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82979 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.715s) 2022-11-23T02:25:21.9073646Z 2022-11-23T02:25:21.9073914Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9074030Z Ran 1 test in 1.715s 2022-11-23T02:25:21.9074050Z 2022-11-23T02:25:21.9074159Z OK (skipped=1) 2022-11-23T02:25:21.9074179Z 2022-11-23T02:25:21.9074367Z Generating XML reports... 2022-11-23T02:25:21.9074763Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021440.xml 2022-11-23T02:25:21.9075131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9075307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9075675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9075859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9075879Z 2022-11-23T02:25:21.9075988Z Running tests... 2022-11-23T02:25:21.9076256Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9076678Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9076939Z test_find_unused_parameters_kwarg_grad_is_view_debug_info (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9077691Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82400 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.687s) 2022-11-23T02:25:21.9077711Z 2022-11-23T02:25:21.9078038Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9078090Z Ran 1 test in 1.687s 2022-11-23T02:25:21.9078109Z 2022-11-23T02:25:21.9078197Z OK (skipped=1) 2022-11-23T02:25:21.9078238Z 2022-11-23T02:25:21.9078392Z Generating XML reports... 2022-11-23T02:25:21.9078860Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021444.xml 2022-11-23T02:25:21.9079235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9079412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9079886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9079988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9080007Z 2022-11-23T02:25:21.9080194Z Running tests... 2022-11-23T02:25:21.9080399Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9080676Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9080991Z test_find_unused_parameters_kwarg_grad_is_view_debug_off (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9081738Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82500 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.711s) 2022-11-23T02:25:21.9081759Z 2022-11-23T02:25:21.9082027Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9082142Z Ran 1 test in 1.712s 2022-11-23T02:25:21.9082161Z 2022-11-23T02:25:21.9082270Z OK (skipped=1) 2022-11-23T02:25:21.9082289Z 2022-11-23T02:25:21.9082417Z Generating XML reports... 2022-11-23T02:25:21.9082873Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021448.xml 2022-11-23T02:25:21.9083329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9083515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9083873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9084061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9084061Z 2022-11-23T02:25:21.9084199Z Running tests... 2022-11-23T02:25:21.9084443Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9084758Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9084962Z test_fp16 (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9085139Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47636 2022-11-23T02:25:21.9085343Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47637 2022-11-23T02:25:21.9085745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9085928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9086303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9086490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9086845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9087016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9087390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9087577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9087854Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9088070Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9088329Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdkyo8z4m 2022-11-23T02:25:21.9088600Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdkyo8z4m/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9088851Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4z_uensi 2022-11-23T02:25:21.9089114Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4z_uensi/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9089215Z ok 
(7.291s) 2022-11-23T02:25:21.9089234Z 2022-11-23T02:25:21.9089502Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9089615Z Ran 1 test in 7.291s 2022-11-23T02:25:21.9089634Z 2022-11-23T02:25:21.9089706Z OK 2022-11-23T02:25:21.9089753Z 2022-11-23T02:25:21.9089860Z Generating XML reports... 2022-11-23T02:25:21.9090322Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021452.xml 2022-11-23T02:25:21.9090696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9090877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9091257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9091452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9091471Z 2022-11-23T02:25:21.9091582Z Running tests... 2022-11-23T02:25:21.9091843Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9092134Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9092423Z test_fp16_compress_wrapper_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9092642Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47857 2022-11-23T02:25:21.9092859Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47858 2022-11-23T02:25:21.9093226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9093400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9093774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9093963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9094310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9094484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9094906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9095102Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9095332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9095876Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9096100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9096637Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9096982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpltka8b_r 2022-11-23T02:25:21.9097252Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpltka8b_r/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9097503Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0zykuhm7 2022-11-23T02:25:21.9097754Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0zykuhm7/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9097856Z ok (6.910s) 2022-11-23T02:25:21.9097875Z 2022-11-23T02:25:21.9098147Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9098259Z Ran 1 test in 6.910s 2022-11-23T02:25:21.9098279Z 2022-11-23T02:25:21.9098374Z OK 2022-11-23T02:25:21.9098393Z 2022-11-23T02:25:21.9098516Z Generating XML reports... 2022-11-23T02:25:21.9098975Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021502.xml 2022-11-23T02:25:21.9099346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9099503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9099882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9100073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9100092Z 2022-11-23T02:25:21.9100199Z Running tests... 
2022-11-23T02:25:21.9100459Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9100789Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9101056Z test_fp16_compress_wrapper_nccl (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9101272Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48078 2022-11-23T02:25:21.9101487Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48079 2022-11-23T02:25:21.9101838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9102013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9102390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9102579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9102945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9103128Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9103550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9103750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9104275Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9104871Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; 
batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9105115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9105663Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9105901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1tiya49z 2022-11-23T02:25:21.9106170Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1tiya49z/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9106421Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpljlg30zf 2022-11-23T02:25:21.9106687Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpljlg30zf/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9106787Z ok (6.931s) 2022-11-23T02:25:21.9106806Z 2022-11-23T02:25:21.9107080Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9107193Z Ran 1 test in 6.931s 2022-11-23T02:25:21.9107216Z 2022-11-23T02:25:21.9107289Z OK 2022-11-23T02:25:21.9107308Z 2022-11-23T02:25:21.9107436Z Generating XML reports... 
2022-11-23T02:25:21.9107898Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021512.xml 2022-11-23T02:25:21.9108271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9108444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9108819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9109009Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9109028Z 2022-11-23T02:25:21.9109135Z Running tests... 2022-11-23T02:25:21.9109381Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9109696Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9109964Z test_fp16_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9110185Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48299 2022-11-23T02:25:21.9110400Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48300 2022-11-23T02:25:21.9110766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9110939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9111314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9111503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9111850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9112172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9112482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9112670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9112897Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9113122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9113373Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmeizs5az 2022-11-23T02:25:21.9113639Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmeizs5az/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9113872Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcltl71sm 2022-11-23T02:25:21.9114137Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcltl71sm/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9114294Z ok 
(7.409s) 2022-11-23T02:25:21.9114414Z 2022-11-23T02:25:21.9114584Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9114694Z Ran 1 test in 7.409s 2022-11-23T02:25:21.9114712Z 2022-11-23T02:25:21.9114803Z OK 2022-11-23T02:25:21.9114821Z 2022-11-23T02:25:21.9114943Z Generating XML reports... 2022-11-23T02:25:21.9115401Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021521.xml 2022-11-23T02:25:21.9115768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9115963Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9116298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9116492Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9116511Z 2022-11-23T02:25:21.9116621Z Running tests... 2022-11-23T02:25:21.9116885Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9117194Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9117511Z test_grad_layout_1devicemodule_1replicaperprocess (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9117727Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48520 2022-11-23T02:25:21.9117924Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48521 2022-11-23T02:25:21.9118295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9118600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9119061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9119188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9119547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9119717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9120380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9120620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9120889Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9121154Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9121466Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8u8bl4en 2022-11-23T02:25:21.9121905Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8u8bl4en/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9122229Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpstmvonve 2022-11-23T02:25:21.9122537Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpstmvonve/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9122819Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9145705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9145960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9146112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9146345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9146766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9146997Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9147224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9147447Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9147654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9147887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9148112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9148335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9148565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9148802Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9149033Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9149257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:25:21.9149465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9149683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9149913Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9150142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9150376Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9150576Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9150804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9151021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9151248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9151453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9151684Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9151905Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9152124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9152353Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9152584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9152872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9153104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:25:21.9153306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9153536Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9153641Z ok (9.118s) 2022-11-23T02:25:21.9153663Z 2022-11-23T02:25:21.9153980Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9154088Z Ran 1 test in 9.118s 2022-11-23T02:25:21.9154108Z 2022-11-23T02:25:21.9154204Z OK 2022-11-23T02:25:21.9154224Z 2022-11-23T02:25:21.9154344Z Generating XML reports... 2022-11-23T02:25:21.9154819Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021531.xml 2022-11-23T02:25:21.9155242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9155430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9155816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9156008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9156028Z 2022-11-23T02:25:21.9156128Z Running tests... 2022-11-23T02:25:21.9156400Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9156717Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9156995Z test_grad_layout_2devicemodule (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9157208Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48741 2022-11-23T02:25:21.9157416Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48742 2022-11-23T02:25:21.9157794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9157966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9158340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9158536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9158897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9159064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9159434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9159610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9159849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9160080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9160343Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmuyguua0 2022-11-23T02:25:21.9160623Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmuyguua0/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9160884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjrb3t90_ 2022-11-23T02:25:21.9161152Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjrb3t90_/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9161511Z [W 
logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:25:21.9161748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9162135Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:25:21.9162381Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9162616Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9162849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9163079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9163306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9163532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9163757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9164143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9164379Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9164609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9164840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9165070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9165337Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:25:21.9165523Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9165747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9165978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9166192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9166419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9166641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9166866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9167095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9167326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9167554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9167777Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9167978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9168214Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9168445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9168673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9168898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9169120Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:25:21.9169436Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9169581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9169786Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9170017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9170324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9170441Z ok (11.396s) 2022-11-23T02:25:21.9170461Z 2022-11-23T02:25:21.9170742Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9170860Z Ran 1 test in 11.396s 2022-11-23T02:25:21.9170879Z 2022-11-23T02:25:21.9170975Z OK 2022-11-23T02:25:21.9170995Z 2022-11-23T02:25:21.9171123Z Generating XML reports... 2022-11-23T02:25:21.9171593Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021542.xml 2022-11-23T02:25:21.9171945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9172129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9172574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9172771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9172791Z 2022-11-23T02:25:21.9172902Z Running tests... 2022-11-23T02:25:21.9173171Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9173485Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9173764Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9173976Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48971 2022-11-23T02:25:21.9174191Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48972 2022-11-23T02:25:21.9174566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9174751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9175136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9175331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9175696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9175873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9176252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9176422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9176657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9176884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9177540Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9177992Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; 
min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9178585Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9179147Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9179684Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9180221Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9180810Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency 
= 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9181344Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9181890Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9182430Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9182970Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9183506Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9183614Z ok (4.094s) 2022-11-23T02:25:21.9183634Z 2022-11-23T02:25:21.9184293Z 
---------------------------------------------------------------------- 2022-11-23T02:25:21.9184393Z Ran 1 test in 4.094s 2022-11-23T02:25:21.9184435Z 2022-11-23T02:25:21.9184507Z OK 2022-11-23T02:25:21.9184562Z 2022-11-23T02:25:21.9184670Z Generating XML reports... 2022-11-23T02:25:21.9185174Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021556.xml 2022-11-23T02:25:21.9185528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9185720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9186092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9186300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9186321Z 2022-11-23T02:25:21.9186435Z Running tests... 2022-11-23T02:25:21.9186704Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9187018Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9187322Z test_multiple_outputs_multiple_backward (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9187525Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49170 2022-11-23T02:25:21.9187751Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49171 2022-11-23T02:25:21.9188206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9188385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9188766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9188962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9189330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9189507Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9189884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9190055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9190290Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9190531Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9190792Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ob8h0_a 2022-11-23T02:25:21.9191068Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ob8h0_a/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9191327Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0dvsi457 2022-11-23T02:25:21.9191599Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0dvsi457/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9191706Z ok 
(7.413s) 2022-11-23T02:25:21.9191726Z 2022-11-23T02:25:21.9191977Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9192094Z Ran 1 test in 7.414s 2022-11-23T02:25:21.9192113Z 2022-11-23T02:25:21.9192204Z OK 2022-11-23T02:25:21.9192223Z 2022-11-23T02:25:21.9192358Z Generating XML reports... 2022-11-23T02:25:21.9192829Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021603.xml 2022-11-23T02:25:21.9193211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9193394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9193777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9193971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9193991Z 2022-11-23T02:25:21.9194081Z Running tests... 2022-11-23T02:25:21.9194349Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9194670Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9195032Z test_multiple_outputs_multiple_backward_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9195268Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49391 2022-11-23T02:25:21.9195490Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49392 2022-11-23T02:25:21.9195865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9196042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9196407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9196601Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9196967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9197150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9197593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9197786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9198021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9198253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9198516Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfxmcbply 2022-11-23T02:25:21.9198770Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfxmcbply/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9199031Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphecp5qtl 2022-11-23T02:25:21.9199306Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphecp5qtl/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9199415Z ok 
(7.402s) 2022-11-23T02:25:21.9199435Z 2022-11-23T02:25:21.9199715Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9199829Z Ran 1 test in 7.402s 2022-11-23T02:25:21.9199848Z 2022-11-23T02:25:21.9199944Z OK 2022-11-23T02:25:21.9199962Z 2022-11-23T02:25:21.9200090Z Generating XML reports... 2022-11-23T02:25:21.9200533Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021613.xml 2022-11-23T02:25:21.9200910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9201089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9201473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9201665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9201688Z 2022-11-23T02:25:21.9201800Z Running tests... 2022-11-23T02:25:21.9202069Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9202383Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9202689Z test_nccl_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9202891Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49612 2022-11-23T02:25:21.9203117Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49613 2022-11-23T02:25:21.9203491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9203666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9204040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9204289Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9204663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9204832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9205186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9205379Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9205616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9205842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9206100Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpite0bcux 2022-11-23T02:25:21.9206366Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpite0bcux/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9206666Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_7r4s5at 2022-11-23T02:25:21.9206935Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_7r4s5at/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9207172Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9207387Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9207494Z ok (7.313s) 2022-11-23T02:25:21.9207513Z 2022-11-23T02:25:21.9207780Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9207886Z Ran 1 test in 7.313s 2022-11-23T02:25:21.9207905Z 2022-11-23T02:25:21.9208000Z OK 2022-11-23T02:25:21.9208019Z 2022-11-23T02:25:21.9208145Z Generating XML reports... 2022-11-23T02:25:21.9208607Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021623.xml 2022-11-23T02:25:21.9208979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9209136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9209510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9209705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9209724Z 2022-11-23T02:25:21.9209829Z Running tests... 2022-11-23T02:25:21.9210087Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9210397Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9210715Z test_nccl_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9210937Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49833 2022-11-23T02:25:21.9211151Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49834 2022-11-23T02:25:21.9211503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9211682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9212060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9212254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9212691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9212783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9213236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9213402Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9213620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9213849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9214100Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf1szgift 2022-11-23T02:25:21.9214375Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf1szgift/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9214620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpme2s88vi 2022-11-23T02:25:21.9214882Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpme2s88vi/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9215121Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9215407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9215505Z ok (7.328s) 2022-11-23T02:25:21.9215525Z 2022-11-23T02:25:21.9215775Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9215969Z Ran 1 test in 7.328s 2022-11-23T02:25:21.9215998Z 2022-11-23T02:25:21.9216051Z OK 2022-11-23T02:25:21.9216051Z 2022-11-23T02:25:21.9216137Z Generating XML reports... 2022-11-23T02:25:21.9216599Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021632.xml 2022-11-23T02:25:21.9216967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9217138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9217522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9217702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9217746Z 2022-11-23T02:25:21.9217836Z Running tests... 2022-11-23T02:25:21.9218101Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9218408Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9218679Z test_nccl_backend_2gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9218901Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50054 2022-11-23T02:25:21.9219123Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50055 2022-11-23T02:25:21.9219493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9219672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9220039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9220227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9220584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9220762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9221128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9221313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9221546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9221776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9222117Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgixt1zkg 2022-11-23T02:25:21.9222449Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgixt1zkg/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9222702Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpamixled8 2022-11-23T02:25:21.9222976Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpamixled8/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9223342Z [W 
logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:25:21.9223685Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:25:21.9223916Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9224495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9224703Z ok (9.108s) 2022-11-23T02:25:21.9224781Z 2022-11-23T02:25:21.9225070Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9225164Z Ran 1 test in 9.108s 2022-11-23T02:25:21.9225207Z 2022-11-23T02:25:21.9225306Z OK 2022-11-23T02:25:21.9225306Z 2022-11-23T02:25:21.9225438Z Generating XML reports... 2022-11-23T02:25:21.9225890Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021642.xml 2022-11-23T02:25:21.9226275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9226433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9226807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9226972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9227020Z 2022-11-23T02:25:21.9227108Z Running tests... 2022-11-23T02:25:21.9227373Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9227634Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9227905Z test_nccl_backend_4gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9228129Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50284 2022-11-23T02:25:21.9228432Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50285 2022-11-23T02:25:21.9228781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9228955Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9229310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9229444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9229822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9229997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9230358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9230547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9230773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9231043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9231137Z skip: Need at least 8 CUDA devices (4.120s) 2022-11-23T02:25:21.9231183Z 2022-11-23T02:25:21.9231428Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9231545Z Ran 1 test in 4.120s 2022-11-23T02:25:21.9231568Z 2022-11-23T02:25:21.9231679Z OK (skipped=1) 2022-11-23T02:25:21.9231697Z 2022-11-23T02:25:21.9231906Z Generating XML reports... 
2022-11-23T02:25:21.9232387Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021654.xml 2022-11-23T02:25:21.9232764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9232942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9233318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9233490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9233510Z 2022-11-23T02:25:21.9233624Z Running tests... 2022-11-23T02:25:21.9233893Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9234211Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9234568Z test_nccl_backend_multi_device_ids_not_allowed (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9234795Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50483 2022-11-23T02:25:21.9235015Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50484 2022-11-23T02:25:21.9235388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9235542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9235926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9236120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9236487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9236753Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9237112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9237238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9237518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9237697Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9237780Z ok (5.762s) 2022-11-23T02:25:21.9237799Z 2022-11-23T02:25:21.9238126Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9238251Z Ran 1 test in 5.762s 2022-11-23T02:25:21.9238362Z 2022-11-23T02:25:21.9238430Z OK 2022-11-23T02:25:21.9238430Z 2022-11-23T02:25:21.9238557Z Generating XML reports... 
2022-11-23T02:25:21.9238984Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021700.xml 2022-11-23T02:25:21.9239354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9239609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9239901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9240172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9240172Z 2022-11-23T02:25:21.9240279Z Running tests... 2022-11-23T02:25:21.9240479Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9240866Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9241165Z test_nccl_backend_multi_device_module_device_ids_None (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9241380Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50688 2022-11-23T02:25:21.9241610Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50689 2022-11-23T02:25:21.9241985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9242140Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9242517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9242713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9243078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9243255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9243684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9243882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9244116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9244325Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9244587Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd_p9giaa 2022-11-23T02:25:21.9244865Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd_p9giaa/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9245124Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn0u1srsz 2022-11-23T02:25:21.9245394Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn0u1srsz/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9245753Z [W 
logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:25:21.9246110Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:25:21.9246435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9246674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9246758Z ok (9.236s) 2022-11-23T02:25:21.9246778Z 2022-11-23T02:25:21.9247056Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9247173Z Ran 1 test in 9.236s 2022-11-23T02:25:21.9247192Z 2022-11-23T02:25:21.9247290Z OK 2022-11-23T02:25:21.9247309Z 2022-11-23T02:25:21.9247435Z Generating XML reports... 2022-11-23T02:25:21.9247898Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021709.xml 2022-11-23T02:25:21.9248281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9248461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9248841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9249013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9249033Z 2022-11-23T02:25:21.9249146Z Running tests... 2022-11-23T02:25:21.9249414Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9249728Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9250044Z test_nccl_backend_single_device_module_device_ids_None (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9250267Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50918 2022-11-23T02:25:21.9250544Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50919 2022-11-23T02:25:21.9250932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9251086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9251467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9251659Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9252028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9252210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9252592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9252898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9253138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9253371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9253608Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1nz4inrn 2022-11-23T02:25:21.9253881Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1nz4inrn/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9254139Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkkbz_q6x 2022-11-23T02:25:21.9254411Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkkbz_q6x/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9254650Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9254887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9254998Z ok (7.315s) 2022-11-23T02:25:21.9255018Z 2022-11-23T02:25:21.9255297Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9255391Z Ran 1 test in 7.315s 2022-11-23T02:25:21.9255410Z 2022-11-23T02:25:21.9255506Z OK 2022-11-23T02:25:21.9255525Z 2022-11-23T02:25:21.9255651Z Generating XML reports... 2022-11-23T02:25:21.9256118Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021720.xml 2022-11-23T02:25:21.9256493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9256672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9257051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9257247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9257270Z 2022-11-23T02:25:21.9257385Z Running tests... 2022-11-23T02:25:21.9257638Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9257948Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9258269Z test_nccl_backend_single_device_module_empty_device_ids (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9258493Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51139 2022-11-23T02:25:21.9258714Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51140 2022-11-23T02:25:21.9259089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9259268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9259638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9259843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9260230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9260425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9260808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9261000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9261233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9261466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9261726Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplkz42ajc 2022-11-23T02:25:21.9262057Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplkz42ajc/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9262293Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa8cnd6s0 2022-11-23T02:25:21.9262563Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa8cnd6s0/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9262803Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9263043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9263145Z ok (7.431s) 2022-11-23T02:25:21.9263164Z 2022-11-23T02:25:21.9263504Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9263558Z Ran 1 test in 7.431s 2022-11-23T02:25:21.9263577Z 2022-11-23T02:25:21.9263672Z OK 2022-11-23T02:25:21.9263691Z 2022-11-23T02:25:21.9263796Z Generating XML reports... 2022-11-23T02:25:21.9264606Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021730.xml 2022-11-23T02:25:21.9264991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9265176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9265553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9265752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9265781Z 2022-11-23T02:25:21.9265894Z Running tests... 2022-11-23T02:25:21.9266069Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9266384Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9266652Z test_nccl_propagate_error_reason (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9266884Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51360 2022-11-23T02:25:21.9267109Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51361 2022-11-23T02:25:21.9267481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9267659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9268040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9268235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9268603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9268757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9269131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9269403Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9269650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9269881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9269988Z ok (22.868s) 2022-11-23T02:25:21.9270007Z 2022-11-23T02:25:21.9270279Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9270396Z Ran 1 test in 22.868s 2022-11-23T02:25:21.9270415Z 2022-11-23T02:25:21.9270488Z OK 2022-11-23T02:25:21.9270530Z 2022-11-23T02:25:21.9270634Z Generating XML reports... 
2022-11-23T02:25:21.9271097Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021740.xml 2022-11-23T02:25:21.9271467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9271730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9272109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9272304Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9272323Z 2022-11-23T02:25:21.9272434Z Running tests... 2022-11-23T02:25:21.9272700Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9272990Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9273178Z test_no_grad (__main__.DistributedDataParallelTest) 2022-11-23T02:25:21.9273434Z Note: this test can be sped up by only running it on a CPU module ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9273659Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51577 2022-11-23T02:25:21.9273886Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51578 2022-11-23T02:25:21.9274258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9274438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9274825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9274996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9275367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9275545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9275923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9276119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9276352Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9276585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9276848Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpltk6nm5_ 2022-11-23T02:25:21.9277121Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpltk6nm5_/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9277358Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyr6wumxz 2022-11-23T02:25:21.9277631Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyr6wumxz/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9277739Z ok 
(7.427s) 2022-11-23T02:25:21.9277758Z 2022-11-23T02:25:21.9278033Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9278152Z Ran 1 test in 7.427s 2022-11-23T02:25:21.9278174Z 2022-11-23T02:25:21.9278270Z OK 2022-11-23T02:25:21.9278289Z 2022-11-23T02:25:21.9278466Z Generating XML reports... 2022-11-23T02:25:21.9278942Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021805.xml 2022-11-23T02:25:21.9279289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9279470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9279848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9280044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9280063Z 2022-11-23T02:25:21.9280176Z Running tests... 2022-11-23T02:25:21.9280446Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9280762Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9281108Z test_param_layout_mismatch_error (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9281331Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51790 2022-11-23T02:25:21.9281528Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51791 2022-11-23T02:25:21.9281900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9282080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9282452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9282629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9283013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9283216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9283601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9283771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9284005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9284231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9284493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptzsfowb9 2022-11-23T02:25:21.9284771Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptzsfowb9/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9285028Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr7_f7v7n 2022-11-23T02:25:21.9285301Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr7_f7v7n/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9285412Z ok 
(6.930s) 2022-11-23T02:25:21.9285432Z 2022-11-23T02:25:21.9285709Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9285802Z Ran 1 test in 6.930s 2022-11-23T02:25:21.9285821Z 2022-11-23T02:25:21.9285916Z OK 2022-11-23T02:25:21.9285935Z 2022-11-23T02:25:21.9286064Z Generating XML reports... 2022-11-23T02:25:21.9286527Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021815.xml 2022-11-23T02:25:21.9286902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9287082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9287462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9287661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9287730Z 2022-11-23T02:25:21.9287849Z Running tests... 2022-11-23T02:25:21.9288095Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9288413Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9288686Z test_pass_default_pg (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9288909Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52003 2022-11-23T02:25:21.9289132Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52004 2022-11-23T02:25:21.9289503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9289682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9290052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9290267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9290651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9290845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9291226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9291421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9291657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9291913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9292142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9292369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9292778Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9293257Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9293286Z ok (4.102s) 2022-11-23T02:25:21.9293306Z 2022-11-23T02:25:21.9293577Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9293692Z Ran 1 test in 4.103s 2022-11-23T02:25:21.9293712Z 2022-11-23T02:25:21.9293805Z OK 2022-11-23T02:25:21.9293824Z 2022-11-23T02:25:21.9293949Z Generating XML reports... 2022-11-23T02:25:21.9294411Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021825.xml 2022-11-23T02:25:21.9294764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9294945Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9295326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9295523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9295544Z 2022-11-23T02:25:21.9295656Z Running tests... 2022-11-23T02:25:21.9296011Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9296243Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9296526Z test_powerSGD_ddp_comm_hook_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9296752Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52206 2022-11-23T02:25:21.9296956Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52207 2022-11-23T02:25:21.9297411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9297603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9297993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9298187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9298555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9298787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9299114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9299284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9299574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9300125Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9300357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9300892Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9301159Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_u0rf3rk 2022-11-23T02:25:21.9301435Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_u0rf3rk/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9301695Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt1k7d11e 2022-11-23T02:25:21.9301968Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt1k7d11e/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9302513Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9303060Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9303599Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9304461Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; 
min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9305095Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9305615Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9306203Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9306792Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9307312Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; 
compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9307841Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9308316Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9308921Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9309468Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9310009Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9310116Z ok (6.962s) 
2022-11-23T02:25:21.9310136Z 2022-11-23T02:25:21.9310425Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9310519Z Ran 1 test in 6.962s 2022-11-23T02:25:21.9310563Z 2022-11-23T02:25:21.9310636Z OK 2022-11-23T02:25:21.9310655Z 2022-11-23T02:25:21.9310782Z Generating XML reports... 2022-11-23T02:25:21.9311248Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021831.xml 2022-11-23T02:25:21.9311671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9311860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9312296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9312525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9312525Z 2022-11-23T02:25:21.9312578Z Running tests... 2022-11-23T02:25:21.9312822Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9313139Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9313440Z test_powerSGD_ddp_comm_hook_nccl_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9313719Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52427 2022-11-23T02:25:21.9313942Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52428 2022-11-23T02:25:21.9314319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9314498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9314921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9315054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9315424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9315603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9315981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9316179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9316489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9316968Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9317199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9317740Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9318002Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo5frnmwa 2022-11-23T02:25:21.9318278Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo5frnmwa/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9318513Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvra1av3z 2022-11-23T02:25:21.9318786Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvra1av3z/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9319330Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9320001Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9320475Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9321022Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; 
min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9321611Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9322239Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9322786Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9323328Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9323862Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; 
compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9324389Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9324933Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9325464Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-11-23T02:25:21.9326057Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9326597Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:25:21.9326708Z ok (6.905s) 
2022-11-23T02:25:21.9326729Z 2022-11-23T02:25:21.9327011Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9327128Z Ran 1 test in 6.905s 2022-11-23T02:25:21.9327147Z 2022-11-23T02:25:21.9327245Z OK 2022-11-23T02:25:21.9327313Z 2022-11-23T02:25:21.9327444Z Generating XML reports... 2022-11-23T02:25:21.9327919Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021841.xml 2022-11-23T02:25:21.9328299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9328457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9328839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9329038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9329058Z 2022-11-23T02:25:21.9329174Z Running tests... 2022-11-23T02:25:21.9329437Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9329755Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9330043Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9330268Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52648 2022-11-23T02:25:21.9330491Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52649 2022-11-23T02:25:21.9330843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9331024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9331413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9331607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9331979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9332157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9332541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9332800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9332948Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9333177Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9333437Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1_vsg5mm 2022-11-23T02:25:21.9333712Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1_vsg5mm/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9333969Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp91ojd0u4 2022-11-23T02:25:21.9334240Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp91ojd0u4/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9334534Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9334782Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9334887Z ok (8.412s) 2022-11-23T02:25:21.9334906Z 2022-11-23T02:25:21.9335157Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9335273Z Ran 1 test in 8.412s 2022-11-23T02:25:21.9335292Z 2022-11-23T02:25:21.9335389Z OK 2022-11-23T02:25:21.9335408Z 2022-11-23T02:25:21.9335535Z Generating XML reports... 2022-11-23T02:25:21.9336001Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021850.xml 2022-11-23T02:25:21.9336375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9336555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9336996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9337258Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9337258Z 2022-11-23T02:25:21.9337304Z Running tests... 2022-11-23T02:25:21.9337573Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9337886Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9338178Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9338403Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52869 2022-11-23T02:25:21.9338627Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52870 2022-11-23T02:25:21.9338997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9339185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9339544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9339740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9340103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9340283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9340660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9340856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9341089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9341324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9341568Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt4_p1byd 2022-11-23T02:25:21.9341844Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt4_p1byd/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9342102Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj6zybqxl 2022-11-23T02:25:21.9342377Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj6zybqxl/_remote_module_non_scriptable.py 2022-11-23T02:25:21.9342617Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9342858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:21.9342964Z ok (7.597s) 2022-11-23T02:25:21.9342983Z 2022-11-23T02:25:21.9343253Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9343347Z Ran 1 test in 7.597s 2022-11-23T02:25:21.9343396Z 2022-11-23T02:25:21.9343470Z OK 2022-11-23T02:25:21.9343489Z 2022-11-23T02:25:21.9343663Z Generating XML reports... 2022-11-23T02:25:21.9344466Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021901.xml 2022-11-23T02:25:21.9344884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9345053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9345422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9345619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9345657Z 2022-11-23T02:25:21.9345772Z Running tests... 2022-11-23T02:25:21.9345993Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9346327Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9346702Z test_invalid_nccl_blocking_wait_env (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9346914Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53090 2022-11-23T02:25:21.9347117Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53091 2022-11-23T02:25:21.9347279Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53092 2022-11-23T02:25:21.9347656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9347835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9348194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9348390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9348767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9348947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9349323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9349523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9349887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9350064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9350439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9350607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9350841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9351075Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9351306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:21.9351413Z ok (4.104s) 2022-11-23T02:25:21.9351432Z 2022-11-23T02:25:21.9351704Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9351820Z Ran 1 test in 4.104s 2022-11-23T02:25:21.9351839Z 2022-11-23T02:25:21.9351937Z OK 2022-11-23T02:25:21.9351956Z 2022-11-23T02:25:21.9352061Z Generating XML reports... 2022-11-23T02:25:21.9352587Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123021911.xml 2022-11-23T02:25:21.9352870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9353054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9353508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9353714Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9353733Z 2022-11-23T02:25:21.9353847Z Running tests... 2022-11-23T02:25:21.9354117Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9354432Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9354679Z test_nccl_blocking_wait_with_barrier (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9354906Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53355 2022-11-23T02:25:21.9355132Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53356 2022-11-23T02:25:21.9355352Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53357 2022-11-23T02:25:21.9355787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9355966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9356345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9356540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9356886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9357065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9357437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9357630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9358050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9358186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9358566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9358760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9358994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9359204Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:21.9359434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9359906Z [W ProcessGroupNCCL.cpp:950] [Rank 0] Found key in store: NCCLABORTEDCOMM:20a369ac1102000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, from rank: 0. This means that rank has aborted its NCCL communicators previously and is not in a healthy state.. Aborting appropriate communicators 2022-11-23T02:25:21.9360020Z ok (15.820s) 2022-11-23T02:25:21.9360039Z 2022-11-23T02:25:21.9360311Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9360431Z Ran 1 test in 15.820s 2022-11-23T02:25:21.9360450Z 2022-11-23T02:25:21.9360545Z OK 2022-11-23T02:25:21.9360565Z 2022-11-23T02:25:21.9360692Z Generating XML reports... 2022-11-23T02:25:21.9361130Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123021918.xml 2022-11-23T02:25:21.9361482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9361662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9362040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9362287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9362308Z 2022-11-23T02:25:21.9362428Z Running tests... 2022-11-23T02:25:21.9362700Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9363014Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9363362Z test_nccl_errors_blocking_abort (__main__.NcclErrorHandlingTest) ... 
skip: Frequently times out see https://github.com/pytorch/pytorch/issues/58920 (0.001s) 2022-11-23T02:25:21.9363381Z 2022-11-23T02:25:21.9363647Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9363740Z Ran 1 test in 0.001s 2022-11-23T02:25:21.9363781Z 2022-11-23T02:25:21.9363932Z OK (skipped=1) 2022-11-23T02:25:21.9363932Z 2022-11-23T02:25:21.9364017Z Generating XML reports... 2022-11-23T02:25:21.9364511Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123021936.xml 2022-11-23T02:25:21.9364881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9365062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9365444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9365642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9365662Z 2022-11-23T02:25:21.9365778Z Running tests... 2022-11-23T02:25:21.9366020Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9366335Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9366610Z test_nccl_errors_blocking_clean_exit (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9366839Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53701 2022-11-23T02:25:21.9367066Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53702 2022-11-23T02:25:21.9367285Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53703 2022-11-23T02:25:21.9367664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9367841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9368188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9368368Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9368747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9368947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9369328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9369522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9369894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9370070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9370451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9370620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9370854Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9371081Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9371381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:21.9371495Z ok (19.264s) 2022-11-23T02:25:21.9371516Z 2022-11-23T02:25:21.9371791Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9371910Z Ran 1 test in 19.265s 2022-11-23T02:25:21.9371929Z 2022-11-23T02:25:21.9372023Z OK 2022-11-23T02:25:21.9372042Z 2022-11-23T02:25:21.9372147Z Generating XML reports... 2022-11-23T02:25:21.9372581Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123021939.xml 2022-11-23T02:25:21.9372953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9373134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9373515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9373762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9373781Z 2022-11-23T02:25:21.9373957Z Running tests... 2022-11-23T02:25:21.9374168Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9374482Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9374733Z test_nccl_errors_blocking_nonzero_exit (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9374956Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53984 2022-11-23T02:25:21.9375178Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53985 2022-11-23T02:25:21.9375397Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53986 2022-11-23T02:25:21.9375770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9375956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9376336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9376533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9376881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9377062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9377439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9377634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9378001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9378184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9378561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9378751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9378988Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:21.9379197Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9379426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9379531Z ok (18.982s) 2022-11-23T02:25:21.9379549Z 2022-11-23T02:25:21.9379819Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9380020Z Ran 1 test in 18.982s 2022-11-23T02:25:21.9380020Z 2022-11-23T02:25:21.9380055Z OK 2022-11-23T02:25:21.9380070Z 2022-11-23T02:25:21.9380204Z Generating XML reports... 2022-11-23T02:25:21.9380691Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022000.xml 2022-11-23T02:25:21.9381114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9381235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9381617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9381813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9381834Z 2022-11-23T02:25:21.9381945Z Running tests... 2022-11-23T02:25:21.9382214Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9382531Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9382797Z test_nccl_errors_blocking_sigkill (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9383070Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54267 2022-11-23T02:25:21.9383268Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54268 2022-11-23T02:25:21.9383487Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54269 2022-11-23T02:25:21.9384145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9384406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9384818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9385028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9385397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9385571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9385912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9386041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9386406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9386587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9386969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9387161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9387394Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:21.9387626Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9387841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9387951Z ok (19.242s) 2022-11-23T02:25:21.9387972Z 2022-11-23T02:25:21.9388244Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9388360Z Ran 1 test in 19.242s 2022-11-23T02:25:21.9388380Z 2022-11-23T02:25:21.9388474Z OK 2022-11-23T02:25:21.9388493Z 2022-11-23T02:25:21.9388622Z Generating XML reports... 2022-11-23T02:25:21.9389060Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022022.xml 2022-11-23T02:25:21.9389433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9389615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9389967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9390291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9390313Z 2022-11-23T02:25:21.9390434Z Running tests... 2022-11-23T02:25:21.9390708Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9391025Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9391293Z test_nccl_errors_blocking_sigterm (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9391515Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54550 2022-11-23T02:25:21.9391736Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54551 2022-11-23T02:25:21.9391932Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54552 2022-11-23T02:25:21.9392305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9392552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9392938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9393132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9393501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9393712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9394059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9394255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9394591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9394772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9395146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9395341Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9395579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9395805Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9396033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:21.9396139Z ok (19.046s) 2022-11-23T02:25:21.9396159Z 2022-11-23T02:25:21.9396404Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9396524Z Ran 1 test in 19.046s 2022-11-23T02:25:21.9396543Z 2022-11-23T02:25:21.9396638Z OK 2022-11-23T02:25:21.9396657Z 2022-11-23T02:25:21.9396787Z Generating XML reports... 2022-11-23T02:25:21.9397231Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022044.xml 2022-11-23T02:25:21.9397606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9397789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9398171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9398366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9398386Z 2022-11-23T02:25:21.9398475Z Running tests... 2022-11-23T02:25:21.9398746Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9399062Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9399332Z test_nccl_errors_nonblocking (__main__.NcclErrorHandlingTest) ... skip: Test does not pass when run locally (0.001s) 2022-11-23T02:25:21.9399404Z 2022-11-23T02:25:21.9399677Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9399794Z Ran 1 test in 0.001s 2022-11-23T02:25:21.9399815Z 2022-11-23T02:25:21.9399928Z OK (skipped=1) 2022-11-23T02:25:21.9399947Z 2022-11-23T02:25:21.9400075Z Generating XML reports... 
2022-11-23T02:25:21.9400484Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022105.xml 2022-11-23T02:25:21.9400857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9401036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9401413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9401611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9401677Z 2022-11-23T02:25:21.9401796Z Running tests... 2022-11-23T02:25:21.9402063Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9402379Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9402623Z test_nccl_timeout (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9402823Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54898 2022-11-23T02:25:21.9403045Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54899 2022-11-23T02:25:21.9403265Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54900 2022-11-23T02:25:21.9403639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9403818Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9404214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9404413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9404784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9404939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9405317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9405511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9405875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9406051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9406437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9406630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9406867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9407097Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9407303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:21.9407772Z [W ProcessGroupNCCL.cpp:950] [Rank 1] Found key in store: NCCLABORTEDCOMM:20864bac1102000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, from rank: 0. This means that rank has aborted its NCCL communicators previously and is not in a healthy state.. Aborting appropriate communicators 2022-11-23T02:25:21.9408287Z [W ProcessGroupNCCL.cpp:950] [Rank 2] Found key in store: NCCLABORTEDCOMM:20864bac1102000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, from rank: 0. This means that rank has aborted its NCCL communicators previously and is not in a healthy state.. Aborting appropriate communicators 2022-11-23T02:25:21.9408407Z ok (26.750s) 2022-11-23T02:25:21.9408427Z 2022-11-23T02:25:21.9408701Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9408819Z Ran 1 test in 26.750s 2022-11-23T02:25:21.9408838Z 2022-11-23T02:25:21.9408936Z OK 2022-11-23T02:25:21.9408955Z 2022-11-23T02:25:21.9409082Z Generating XML reports... 
2022-11-23T02:25:21.9409521Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022107.xml 2022-11-23T02:25:21.9409894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9410122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9410487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9410682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9410701Z 2022-11-23T02:25:21.9410812Z Running tests... 2022-11-23T02:25:21.9411080Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9411393Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9411723Z test_allgather_base (__main__.NcclProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9411946Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55190 2022-11-23T02:25:21.9412320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9412480Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9412867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9413060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9413294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9413548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9413952Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9414057Z ok (5.584s) 2022-11-23T02:25:21.9414076Z 2022-11-23T02:25:21.9414338Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9414448Z Ran 1 test in 5.585s 2022-11-23T02:25:21.9414469Z 2022-11-23T02:25:21.9414541Z OK 2022-11-23T02:25:21.9414563Z 2022-11-23T02:25:21.9414688Z Generating XML reports... 
2022-11-23T02:25:21.9415324Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclProcessGroupWithDispatchedCollectivesTests-20221123022137.xml 2022-11-23T02:25:21.9415623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9415804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9416185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9416381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9416400Z 2022-11-23T02:25:21.9416514Z Running tests... 2022-11-23T02:25:21.9416758Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9417078Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9417477Z test_allreduce_coalesced (__main__.NcclProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9417707Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55329 2022-11-23T02:25:21.9418074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9418251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9418637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9418828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9419063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9419292Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9419759Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9420586Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:25:21.9420635Z warnings.warn( 2022-11-23T02:25:21.9420739Z ok (5.651s) 2022-11-23T02:25:21.9420758Z 2022-11-23T02:25:21.9421027Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9421144Z Ran 1 test in 5.651s 2022-11-23T02:25:21.9421163Z 2022-11-23T02:25:21.9421259Z OK 2022-11-23T02:25:21.9421277Z 2022-11-23T02:25:21.9421404Z Generating XML reports... 
2022-11-23T02:25:21.9421937Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclProcessGroupWithDispatchedCollectivesTests-20221123022145.xml 2022-11-23T02:25:21.9422414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9422594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9422972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9423162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9423182Z 2022-11-23T02:25:21.9423289Z Running tests... 2022-11-23T02:25:21.9423552Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9424111Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9424508Z test_collectives (__main__.NcclProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9424749Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55468 2022-11-23T02:25:21.9425107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9425321Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9425703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9425901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9426100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9426381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9426788Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9426871Z ok (5.595s) 2022-11-23T02:25:21.9426871Z 2022-11-23T02:25:21.9427194Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9427411Z Ran 1 test in 5.596s 2022-11-23T02:25:21.9427411Z 2022-11-23T02:25:21.9427509Z OK 2022-11-23T02:25:21.9427530Z 2022-11-23T02:25:21.9427654Z Generating XML reports... 
2022-11-23T02:25:21.9428205Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclProcessGroupWithDispatchedCollectivesTests-20221123022153.xml 2022-11-23T02:25:21.9428572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9428749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9429124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9429293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9429312Z 2022-11-23T02:25:21.9429420Z Running tests... 2022-11-23T02:25:21.9429754Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9430068Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9430400Z test_reduce_scatter_base (__main__.NcclProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9430620Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55607 2022-11-23T02:25:21.9430991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9431165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9431525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9431718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9431947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9432201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9432604Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9432705Z ok (5.625s) 2022-11-23T02:25:21.9432724Z 2022-11-23T02:25:21.9432984Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9433136Z Ran 1 test in 5.625s 2022-11-23T02:25:21.9433136Z 2022-11-23T02:25:21.9433217Z OK 2022-11-23T02:25:21.9433236Z 2022-11-23T02:25:21.9433342Z Generating XML reports... 
2022-11-23T02:25:21.9433896Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclProcessGroupWithDispatchedCollectivesTests-20221123022201.xml 2022-11-23T02:25:21.9434270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9434455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9434844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9435043Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9435063Z 2022-11-23T02:25:21.9435176Z Running tests... 2022-11-23T02:25:21.9435442Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9435735Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9436011Z test_init_no_gpus (__main__.ProcessGroupNCCLNoGPUTest) ... skip: GPUs are available, skipping test (0.001s) 2022-11-23T02:25:21.9436030Z 2022-11-23T02:25:21.9436291Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9436410Z Ran 1 test in 0.001s 2022-11-23T02:25:21.9436429Z 2022-11-23T02:25:21.9436545Z OK (skipped=1) 2022-11-23T02:25:21.9436564Z 2022-11-23T02:25:21.9436692Z Generating XML reports... 
2022-11-23T02:25:21.9437192Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLNoGPUTest-20221123022209.xml 2022-11-23T02:25:21.9437578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9437861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9438314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9438421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9438441Z 2022-11-23T02:25:21.9438555Z Running tests... 2022-11-23T02:25:21.9438850Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9439136Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9439449Z test_allgather_base_basics (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9439674Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55811 2022-11-23T02:25:21.9439896Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55812 2022-11-23T02:25:21.9440274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9440432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9440814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9441010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9441378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9441657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9442036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9442230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9442465Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9442693Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9443068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9443317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9443727Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9444126Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9444242Z ok (5.700s) 2022-11-23T02:25:21.9444262Z 2022-11-23T02:25:21.9444626Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9444657Z Ran 1 test in 5.700s 2022-11-23T02:25:21.9444676Z 2022-11-23T02:25:21.9444777Z OK 2022-11-23T02:25:21.9444796Z 2022-11-23T02:25:21.9444900Z Generating XML reports... 2022-11-23T02:25:21.9445336Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022211.xml 2022-11-23T02:25:21.9445712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9445892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9446277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9446478Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9446498Z 2022-11-23T02:25:21.9446659Z Running tests... 2022-11-23T02:25:21.9446939Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9447232Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9447491Z test_allgather_base_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9447712Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56016 2022-11-23T02:25:21.9447937Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56017 2022-11-23T02:25:21.9448313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9448494Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9448876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9449125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9449500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9449654Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9450031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9450226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9450463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9450716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9450952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9451199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9451608Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9452024Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9452099Z ok (6.909s) 2022-11-23T02:25:21.9452118Z 2022-11-23T02:25:21.9452387Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9452582Z Ran 1 test in 6.909s 2022-11-23T02:25:21.9452582Z 2022-11-23T02:25:21.9452626Z OK 2022-11-23T02:25:21.9452645Z 2022-11-23T02:25:21.9452772Z Generating XML reports... 2022-11-23T02:25:21.9453209Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022220.xml 2022-11-23T02:25:21.9453579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9453768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9454129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9454327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9454347Z 2022-11-23T02:25:21.9454462Z Running tests... 2022-11-23T02:25:21.9454736Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9455051Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9455301Z test_allgather_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9455525Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56229 2022-11-23T02:25:21.9455745Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56230 2022-11-23T02:25:21.9456143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9456330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9456712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9456912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9457275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9457454Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9457833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9458023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9458316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9458545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9458776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9459019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9459424Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9459826Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9459934Z ok (6.873s) 2022-11-23T02:25:21.9459953Z 2022-11-23T02:25:21.9460224Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9460342Z Ran 1 test in 6.873s 2022-11-23T02:25:21.9460364Z 2022-11-23T02:25:21.9460460Z OK 2022-11-23T02:25:21.9460478Z 2022-11-23T02:25:21.9460587Z Generating XML reports... 2022-11-23T02:25:21.9461021Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022229.xml 2022-11-23T02:25:21.9461390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9461568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9461952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9462151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9462172Z 2022-11-23T02:25:21.9462286Z Running tests... 2022-11-23T02:25:21.9462552Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9462843Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9463099Z test_allreduce_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9463322Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56442 2022-11-23T02:25:21.9463543Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56443 2022-11-23T02:25:21.9464224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9464355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9464903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9465130Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9465491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9465657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9466103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9466315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9466473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9466713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9466947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9467192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9467604Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9468063Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9468173Z ok (6.907s) 2022-11-23T02:25:21.9468193Z 2022-11-23T02:25:21.9468465Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9468585Z Ran 1 test in 6.907s 2022-11-23T02:25:21.9468604Z 2022-11-23T02:25:21.9468697Z OK 2022-11-23T02:25:21.9468716Z 2022-11-23T02:25:21.9468842Z Generating XML reports... 2022-11-23T02:25:21.9469278Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022238.xml 2022-11-23T02:25:21.9469651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9469829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9470191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9470393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9470417Z 2022-11-23T02:25:21.9470528Z Running tests... 2022-11-23T02:25:21.9470800Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9471115Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9471353Z test_barrier (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9471575Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56655 2022-11-23T02:25:21.9471800Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56656 2022-11-23T02:25:21.9472150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9472332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9472711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9472917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9473285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9473463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9473844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9474045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9474276Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9474502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9474732Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9475028Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9475441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9475844Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9475952Z ok (6.922s) 2022-11-23T02:25:21.9475971Z 2022-11-23T02:25:21.9476240Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9476357Z Ran 1 test in 6.923s 2022-11-23T02:25:21.9476376Z 2022-11-23T02:25:21.9476448Z OK 2022-11-23T02:25:21.9476489Z 2022-11-23T02:25:21.9476596Z Generating XML reports... 2022-11-23T02:25:21.9477027Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022248.xml 2022-11-23T02:25:21.9477460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9477642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9478023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9478222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9478241Z 2022-11-23T02:25:21.9478355Z Running tests... 2022-11-23T02:25:21.9478622Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9478915Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9479168Z test_broadcast_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9479395Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56868 2022-11-23T02:25:21.9479618Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56869 2022-11-23T02:25:21.9480001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9480182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9480563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9480757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9481100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9481279Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9481697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9481860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9482100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9482351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9482580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9482827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9483230Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9483605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9483713Z ok (6.890s) 2022-11-23T02:25:21.9483732Z 2022-11-23T02:25:21.9483998Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9484121Z Ran 1 test in 6.890s 2022-11-23T02:25:21.9484140Z 2022-11-23T02:25:21.9484235Z OK 2022-11-23T02:25:21.9484254Z 2022-11-23T02:25:21.9484443Z Generating XML reports... 2022-11-23T02:25:21.9484892Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022257.xml 2022-11-23T02:25:21.9485266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9485445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9485805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9485999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9486017Z 2022-11-23T02:25:21.9486130Z Running tests... 2022-11-23T02:25:21.9486404Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9486722Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9487023Z test_empty_tensors (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9487249Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57081 2022-11-23T02:25:21.9487470Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57082 2022-11-23T02:25:21.9487821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9488002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9488370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9488552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9488932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9489133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9489516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9489712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9489922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9490174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9490403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9490644Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9491049Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9491455Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9491562Z ok (6.899s) 2022-11-23T02:25:21.9491581Z 2022-11-23T02:25:21.9491852Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9491993Z Ran 1 test in 6.900s 2022-11-23T02:25:21.9491993Z 2022-11-23T02:25:21.9492062Z OK 2022-11-23T02:25:21.9492104Z 2022-11-23T02:25:21.9492208Z Generating XML reports... 2022-11-23T02:25:21.9492638Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022306.xml 2022-11-23T02:25:21.9493008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9493186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9493568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9493813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9493834Z 2022-11-23T02:25:21.9493952Z Running tests... 2022-11-23T02:25:21.9494265Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9494560Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9494830Z test_gather_checks (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9495035Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57294 2022-11-23T02:25:21.9495255Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57295 2022-11-23T02:25:21.9495625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9495806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9496242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9496436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9496783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9496965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9497342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9497538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9497773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9498023Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9498252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9498504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9498910Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9499305Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9499393Z ok (5.696s) 2022-11-23T02:25:21.9499413Z 2022-11-23T02:25:21.9499679Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9499796Z Ran 1 test in 5.696s 2022-11-23T02:25:21.9499816Z 2022-11-23T02:25:21.9499910Z OK 2022-11-23T02:25:21.9499929Z 2022-11-23T02:25:21.9500056Z Generating XML reports... 2022-11-23T02:25:21.9500486Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022316.xml 2022-11-23T02:25:21.9500863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9501019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9501400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9501594Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9501614Z 2022-11-23T02:25:21.9501728Z Running tests... 2022-11-23T02:25:21.9501996Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9502310Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9502553Z test_gather_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9502781Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57499 2022-11-23T02:25:21.9503011Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57500 2022-11-23T02:25:21.9503409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9503595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9504236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9504502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9504880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9505069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9505468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9505673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9505964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9506218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9506439Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9506665Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9507114Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9507474Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9507583Z ok (6.998s) 2022-11-23T02:25:21.9507605Z 2022-11-23T02:25:21.9507873Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9507995Z Ran 1 test in 6.998s 2022-11-23T02:25:21.9508014Z 2022-11-23T02:25:21.9508086Z OK 2022-11-23T02:25:21.9508108Z 2022-11-23T02:25:21.9508240Z Generating XML reports... 2022-11-23T02:25:21.9508666Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022324.xml 2022-11-23T02:25:21.9509041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9509222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9509603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9509798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9509817Z 2022-11-23T02:25:21.9509929Z Running tests... 2022-11-23T02:25:21.9510197Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9510561Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9510751Z test_gather_stress (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9510973Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57716 2022-11-23T02:25:21.9511199Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57717 2022-11-23T02:25:21.9511571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9511747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9512123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9512321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9512667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9512917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9513313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9513506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9513823Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9513995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9514223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9514472Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9514877Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9515306Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9515417Z ok (11.633s) 2022-11-23T02:25:21.9515436Z 2022-11-23T02:25:21.9515701Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9515920Z Ran 1 test in 11.633s 2022-11-23T02:25:21.9515934Z 2022-11-23T02:25:21.9515936Z OK 2022-11-23T02:25:21.9515956Z 2022-11-23T02:25:21.9516086Z Generating XML reports... 2022-11-23T02:25:21.9516521Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022333.xml 2022-11-23T02:25:21.9516894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9517049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9517430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9517633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9517652Z 2022-11-23T02:25:21.9517762Z Running tests... 2022-11-23T02:25:21.9518034Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9518348Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9518610Z test_nccl_dist_backend_error (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9518834Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57933 2022-11-23T02:25:21.9519056Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57934 2022-11-23T02:25:21.9519404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9519585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9519974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9520172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9520540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9520720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9521099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9521292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9521504Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9521754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9521983Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9522365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9522784Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9523181Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9523287Z ok (5.612s) 2022-11-23T02:25:21.9523307Z 2022-11-23T02:25:21.9523575Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9523690Z Ran 1 test in 5.612s 2022-11-23T02:25:21.9523709Z 2022-11-23T02:25:21.9523781Z OK 2022-11-23T02:25:21.9523800Z 2022-11-23T02:25:21.9523929Z Generating XML reports... 2022-11-23T02:25:21.9524364Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022348.xml 2022-11-23T02:25:21.9524792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9524973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9525347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9525539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9525561Z 2022-11-23T02:25:21.9525673Z Running tests... 2022-11-23T02:25:21.9525916Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9526233Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9526476Z test_reduce_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9526699Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58141 2022-11-23T02:25:21.9526933Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58142 2022-11-23T02:25:21.9527307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9527487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9527869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9528061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9528407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9528588Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9528968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9529167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9529405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9529655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9529882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9530122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9530501Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9530897Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9531004Z ok (7.014s) 2022-11-23T02:25:21.9531024Z 2022-11-23T02:25:21.9531292Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9531496Z Ran 1 test in 7.014s 2022-11-23T02:25:21.9531496Z 2022-11-23T02:25:21.9531576Z OK 2022-11-23T02:25:21.9531597Z 2022-11-23T02:25:21.9531731Z Generating XML reports... 2022-11-23T02:25:21.9532170Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022356.xml 2022-11-23T02:25:21.9532543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9532700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9533081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9533275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9533294Z 2022-11-23T02:25:21.9533405Z Running tests... 2022-11-23T02:25:21.9533741Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9534086Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9534351Z test_reduce_scatter_base_basics (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9534574Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58354 2022-11-23T02:25:21.9534773Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58355 2022-11-23T02:25:21.9535147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9535319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9535696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9535885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9536247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9536431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9536803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9537001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9537215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9537449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9537700Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9537951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9538353Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9538753Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9538861Z ok (5.592s) 2022-11-23T02:25:21.9538881Z 2022-11-23T02:25:21.9539141Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9539258Z Ran 1 test in 5.592s 2022-11-23T02:25:21.9539277Z 2022-11-23T02:25:21.9539350Z OK 2022-11-23T02:25:21.9539454Z 2022-11-23T02:25:21.9539497Z Generating XML reports... 2022-11-23T02:25:21.9539931Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022405.xml 2022-11-23T02:25:21.9540301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9540480Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9540862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9541108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9541129Z 2022-11-23T02:25:21.9541251Z Running tests... 2022-11-23T02:25:21.9541498Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9541818Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9542082Z test_reduce_scatter_base_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9542308Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58559 2022-11-23T02:25:21.9542531Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58560 2022-11-23T02:25:21.9542905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9543132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9543512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9543704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9544307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9544574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9544974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9545157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9545405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9545650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9545891Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9546052Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9546502Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9546913Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9547018Z ok (6.932s) 2022-11-23T02:25:21.9547039Z 2022-11-23T02:25:21.9547300Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9547419Z Ran 1 test in 6.932s 2022-11-23T02:25:21.9547438Z 2022-11-23T02:25:21.9547528Z OK 2022-11-23T02:25:21.9547547Z 2022-11-23T02:25:21.9547670Z Generating XML reports... 2022-11-23T02:25:21.9548101Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022413.xml 2022-11-23T02:25:21.9548475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9548633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9549013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9549206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9549225Z 2022-11-23T02:25:21.9549340Z Running tests... 2022-11-23T02:25:21.9549604Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9549919Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9550174Z test_reduce_scatter_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9550401Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58772 2022-11-23T02:25:21.9550693Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58773 2022-11-23T02:25:21.9551089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9551267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9551641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9551835Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9552207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9552385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9552757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9553044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9553234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9553483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9553712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9553949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9554352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9554749Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9554857Z ok (6.914s) 2022-11-23T02:25:21.9554876Z 2022-11-23T02:25:21.9555147Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9555270Z Ran 1 test in 6.915s 2022-11-23T02:25:21.9555290Z 2022-11-23T02:25:21.9555365Z OK 2022-11-23T02:25:21.9555383Z 2022-11-23T02:25:21.9555515Z Generating XML reports... 2022-11-23T02:25:21.9555950Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022423.xml 2022-11-23T02:25:21.9556321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9556500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9556875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9557067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9557087Z 2022-11-23T02:25:21.9557198Z Running tests... 2022-11-23T02:25:21.9557447Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9557767Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9558021Z test_scatter_checks (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9558242Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58985 2022-11-23T02:25:21.9558464Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58986 2022-11-23T02:25:21.9558836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9559017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9559397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9559586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9559978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9560160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9560542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9560731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9561061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9561275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9561541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9561713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9562078Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9562581Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9562643Z ok (5.586s) 2022-11-23T02:25:21.9562663Z 2022-11-23T02:25:21.9562934Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9563053Z Ran 1 test in 5.586s 2022-11-23T02:25:21.9563072Z 2022-11-23T02:25:21.9563167Z OK 2022-11-23T02:25:21.9563186Z 2022-11-23T02:25:21.9563315Z Generating XML reports... 2022-11-23T02:25:21.9563750Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022432.xml 2022-11-23T02:25:21.9564120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9564277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9564669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9564865Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9564885Z 2022-11-23T02:25:21.9564996Z Running tests... 2022-11-23T02:25:21.9565262Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9565576Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9565822Z test_scatter_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9566047Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59190 2022-11-23T02:25:21.9566247Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59191 2022-11-23T02:25:21.9566681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9566807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9567196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9567391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9567832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9568003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9568320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9568511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9568723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9568972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9569329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9569509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9569913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9570314Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9570415Z ok (6.923s) 2022-11-23T02:25:21.9570513Z 2022-11-23T02:25:21.9570702Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9570910Z Ran 1 test in 6.923s 2022-11-23T02:25:21.9570954Z 2022-11-23T02:25:21.9571113Z OK 2022-11-23T02:25:21.9571131Z 2022-11-23T02:25:21.9571176Z Generating XML reports... 2022-11-23T02:25:21.9571613Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022440.xml 2022-11-23T02:25:21.9572132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9572326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9572677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9572865Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9572865Z 2022-11-23T02:25:21.9573013Z Running tests... 2022-11-23T02:25:21.9573186Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9573579Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9573759Z test_scatter_stress (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9573992Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59407 2022-11-23T02:25:21.9574210Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59408 2022-11-23T02:25:21.9574583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9574762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9575142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9575314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9575682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9575858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9576235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9576435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9576674Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9576924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9577150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9577394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9577776Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9578175Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9578281Z ok (11.518s) 2022-11-23T02:25:21.9578303Z 2022-11-23T02:25:21.9578615Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9578734Z Ran 1 test in 11.518s 2022-11-23T02:25:21.9578753Z 2022-11-23T02:25:21.9578845Z OK 2022-11-23T02:25:21.9578864Z 2022-11-23T02:25:21.9578995Z Generating XML reports... 2022-11-23T02:25:21.9579431Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022449.xml 2022-11-23T02:25:21.9579800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9579957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9580336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9580537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9580556Z 2022-11-23T02:25:21.9580667Z Running tests... 2022-11-23T02:25:21.9580983Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9581302Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9581541Z test_send_recv (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9581761Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59624 2022-11-23T02:25:21.9581958Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59625 2022-11-23T02:25:21.9582352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9582532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9582915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9583104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9583472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9583644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9584271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9584517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9584761Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:21.9584991Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:21.9585208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:21.9585484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9585896Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:21.9586288Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:21.9586401Z ok (5.691s) 2022-11-23T02:25:21.9586438Z 2022-11-23T02:25:21.9586692Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9586780Z Ran 1 test in 5.691s 2022-11-23T02:25:21.9586819Z 2022-11-23T02:25:21.9586896Z OK 2022-11-23T02:25:21.9586944Z 2022-11-23T02:25:21.9587074Z Generating XML reports... 2022-11-23T02:25:21.9587422Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022503.xml 2022-11-23T02:25:21.9587792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9587974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9588432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9588639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9588660Z 2022-11-23T02:25:21.9588775Z Running tests... 2022-11-23T02:25:21.9589022Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9589337Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9589567Z test_common_errors (__main__.RendezvousEnvTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9589810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9590287Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-11-23T02:25:21.9590453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9590917Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9591168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9591564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9591783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9592176Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9592285Z ok (1.720s) 2022-11-23T02:25:21.9592304Z 2022-11-23T02:25:21.9592569Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9592688Z Ran 1 test in 1.721s 2022-11-23T02:25:21.9592710Z 2022-11-23T02:25:21.9592806Z OK 2022-11-23T02:25:21.9592825Z 2022-11-23T02:25:21.9592956Z Generating XML reports... 2022-11-23T02:25:21.9593375Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-RendezvousEnvTest-20221123022512.xml 2022-11-23T02:25:21.9593723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:21.9593909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:21.9594293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:21.9594592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:21.9594610Z 2022-11-23T02:25:21.9594825Z Running tests... 
2022-11-23T02:25:21.9594993Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9595304Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-11-23T02:25:21.9595551Z test_default_store_timeout_nccl (__main__.TimeoutTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:21.9595800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9596180Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9596429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:21.9596819Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:21.9596923Z ok (4.703s) 2022-11-23T02:25:21.9596942Z 2022-11-23T02:25:21.9597206Z ---------------------------------------------------------------------- 2022-11-23T02:25:21.9597322Z Ran 1 test in 4.703s 2022-11-23T02:25:21.9597341Z 2022-11-23T02:25:21.9597435Z OK 2022-11-23T02:25:21.9597457Z 2022-11-23T02:25:21.9597587Z Generating XML reports... 
2022-11-23T02:25:21.9598011Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-TimeoutTest-20221123022516.xml 2022-11-23T02:25:21.9598058Z 2022-11-23T02:25:21.9598667Z ##[endgroup] 2022-11-23T02:25:21.9599107Z FINISHED PRINTING LOG FILE of distributed/test_c10d_nccl (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_nccl_vtevv4rw) 2022-11-23T02:25:21.9599127Z 2022-11-23T02:25:22.1766700Z 2022-11-23T02:25:22.1767110Z real 20m13.871s 2022-11-23T02:25:22.1767314Z user 31m20.767s 2022-11-23T02:25:22.1767433Z sys 24m38.309s 2022-11-23T02:25:22.1767890Z + python test/run_test.py --verbose -i distributed/test_c10d_spawn_gloo 2022-11-23T02:25:24.5987238Z Ignoring disabled issues: [] 2022-11-23T02:25:24.6507943Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:25:24.6508911Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:25:24.6509263Z Selected tests: 2022-11-23T02:25:24.6509550Z distributed/test_c10d_spawn_gloo 2022-11-23T02:25:24.6532474Z Prioritized test from test file changes. 
2022-11-23T02:25:24.6532789Z reordering tests for PR: 2022-11-23T02:25:24.6533094Z prioritized: [] 2022-11-23T02:25:24.6533732Z the rest: ['distributed/test_c10d_spawn_gloo'] 2022-11-23T02:25:24.6533900Z 2022-11-23T02:25:24.6534374Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:25:24.6535297Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:25:24.6540334Z parallel (file granularity) tests: 2022-11-23T02:25:24.6540671Z 2022-11-23T02:25:24.6540967Z serial (file granularity) tests: 2022-11-23T02:25:24.6541172Z distributed/test_c10d_spawn_gloo 2022-11-23T02:25:26.9673964Z Ignoring disabled issues: [] 2022-11-23T02:25:27.3765336Z Running distributed/test_c10d_spawn_gloo ... [2022-11-23 02:25:27.375984] 2022-11-23T02:25:27.3766719Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_spawn_gloo.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:25:27.376440] 2022-11-23T02:27:02.7240653Z 2022-11-23T02:27:02.7241337Z Expand the folded group to see the log file of distributed/test_c10d_spawn_gloo 2022-11-23T02:27:02.7244514Z ##[group]PRINTING LOG FILE of distributed/test_c10d_spawn_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_gloo_okbk7ytk) 2022-11-23T02:27:02.7245123Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptct6svgt 2022-11-23T02:27:02.7245607Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptct6svgt/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7246329Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7247121Z , <__main__.DistributedDataParallelSingleProcessTest testMethod=test_cuda>, <__main__.DistributedDataParallelSingleProcessTest testMethod=test_rnn>]> 2022-11-23T02:27:02.7247830Z test_cpu (__main__.DistributedDataParallelSingleProcessTest) 2022-11-23T02:27:02.7248374Z test_cuda (__main__.DistributedDataParallelSingleProcessTest) 2022-11-23T02:27:02.7248757Z test_rnn (__main__.DistributedDataParallelSingleProcessTest) 2022-11-23T02:27:02.7249253Z 2022-11-23T02:27:02.7249555Z 2022-11-23T02:27:02.7250946Z , <__main__.TestDistributedNNFunctionsGloo testMethod=test_all_to_all>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_all_to_all_single>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_allreduce>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_broadcast>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_gather>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_reduce>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_scatter>]> 2022-11-23T02:27:02.7252227Z test_all_gather (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:27:02.7252614Z test_all_to_all (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:27:02.7252983Z test_all_to_all_single (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:27:02.7253495Z 
test_allreduce (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:27:02.7253906Z test_broadcast (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:27:02.7254297Z test_gather (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:27:02.7254580Z test_reduce (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:27:02.7254988Z test_scatter (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:27:02.7260583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7261041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7261695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7262168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7262666Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9i7w754o 2022-11-23T02:27:02.7263204Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9i7w754o/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7263657Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7263843Z 2022-11-23T02:27:02.7264322Z Running tests... 2022-11-23T02:27:02.7264746Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7265318Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7265956Z test_cpu (__main__.DistributedDataParallelSingleProcessTest) ... INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:27:02.7266448Z ok (0.026s) 2022-11-23T02:27:02.7266584Z 2022-11-23T02:27:02.7266855Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7267207Z Ran 1 test in 0.026s 2022-11-23T02:27:02.7267368Z 2022-11-23T02:27:02.7267482Z OK 2022-11-23T02:27:02.7267598Z 2022-11-23T02:27:02.7267726Z Generating XML reports... 2022-11-23T02:27:02.7268423Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022534.xml 2022-11-23T02:27:02.7269213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7269694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7270268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7270756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7271232Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6ykdzz8m 2022-11-23T02:27:02.7271766Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6ykdzz8m/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7272147Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7272311Z 2022-11-23T02:27:02.7272427Z Running tests... 2022-11-23T02:27:02.7273075Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7273610Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7274241Z test_cuda (__main__.DistributedDataParallelSingleProcessTest) ... INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:27:02.7274725Z ok (0.492s) 2022-11-23T02:27:02.7275045Z 2022-11-23T02:27:02.7275342Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7275666Z Ran 1 test in 0.492s 2022-11-23T02:27:02.7275830Z 2022-11-23T02:27:02.7275927Z OK 2022-11-23T02:27:02.7276068Z 2022-11-23T02:27:02.7276199Z Generating XML reports... 2022-11-23T02:27:02.7276907Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022538.xml 2022-11-23T02:27:02.7277675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7278136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7278832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7279292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7279862Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8_s8i4ta 2022-11-23T02:27:02.7280404Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8_s8i4ta/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7280814Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7281018Z 2022-11-23T02:27:02.7281130Z Running tests... 2022-11-23T02:27:02.7281545Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7282093Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7282692Z test_rnn (__main__.DistributedDataParallelSingleProcessTest) ... INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:27:02.7283166Z ok (1.311s) 2022-11-23T02:27:02.7283318Z 2022-11-23T02:27:02.7283584Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7283901Z Ran 1 test in 1.312s 2022-11-23T02:27:02.7284065Z 2022-11-23T02:27:02.7284167Z OK 2022-11-23T02:27:02.7284306Z 2022-11-23T02:27:02.7284434Z Generating XML reports... 2022-11-23T02:27:02.7285127Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022543.xml 2022-11-23T02:27:02.7285895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7286357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7286939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7287394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7287868Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzi90ejjg 2022-11-23T02:27:02.7288416Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzi90ejjg/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7288857Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7289034Z 2022-11-23T02:27:02.7289146Z Running tests... 2022-11-23T02:27:02.7289557Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7290106Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7290664Z test_all_gather (__main__.TestDistributedNNFunctionsGloo) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60424 2022-11-23T02:27:02.7291219Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60425 2022-11-23T02:27:02.7291833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7292292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7292852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7293418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7294016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7294463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7295023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7295492Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7296022Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy9xiuctm 2022-11-23T02:27:02.7296493Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy9xiuctm/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7297033Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfu1fc6l8 2022-11-23T02:27:02.7297646Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfu1fc6l8/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7298076Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7298469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:02.7298866Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7299276Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 0 2022-11-23T02:27:02.7299754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:02.7300259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:02.7300925Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7301625Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7302011Z ok (5.585s) 2022-11-23T02:27:02.7302166Z 2022-11-23T02:27:02.7302440Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7302867Z Ran 1 test in 5.586s 2022-11-23T02:27:02.7303031Z 2022-11-23T02:27:02.7303104Z OK 2022-11-23T02:27:02.7303241Z 2022-11-23T02:27:02.7303367Z Generating XML reports... 2022-11-23T02:27:02.7304475Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022548.xml 2022-11-23T02:27:02.7305225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7305628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7306162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7306645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7307098Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphjnld3g7 2022-11-23T02:27:02.7307650Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphjnld3g7/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7308085Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7308288Z 2022-11-23T02:27:02.7308403Z Running 
tests... 2022-11-23T02:27:02.7308793Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7309335Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7309917Z test_all_to_all (__main__.TestDistributedNNFunctionsGloo) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60640 2022-11-23T02:27:02.7310459Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60641 2022-11-23T02:27:02.7311051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7311617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7312216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7312672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7313256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7313708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7314291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7314749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7315221Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4pectzz8 2022-11-23T02:27:02.7315858Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4pectzz8/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7316377Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv6vau1s_ 2022-11-23T02:27:02.7316920Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv6vau1s_/_remote_module_non_scriptable.py 
2022-11-23T02:27:02.7317354Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7317769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:02.7318140Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7318549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:02.7319047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:02.7319524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:02.7320206Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7320896Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7321295Z ok (5.486s) 2022-11-23T02:27:02.7321447Z 2022-11-23T02:27:02.7321695Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7322033Z Ran 1 test in 5.487s 2022-11-23T02:27:02.7322199Z 2022-11-23T02:27:02.7322297Z OK 2022-11-23T02:27:02.7322435Z 2022-11-23T02:27:02.7322539Z Generating XML reports... 
2022-11-23T02:27:02.7323179Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022558.xml 2022-11-23T02:27:02.7323917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7324388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7324954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7325427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7325899Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpudpu82ba 2022-11-23T02:27:02.7326443Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpudpu82ba/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7326855Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7327056Z 2022-11-23T02:27:02.7327169Z Running tests... 2022-11-23T02:27:02.7327686Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7328115Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7328703Z test_all_to_all_single (__main__.TestDistributedNNFunctionsGloo) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60856 2022-11-23T02:27:02.7329316Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60857 2022-11-23T02:27:02.7329941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7330376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7330961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7331435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7331993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7332445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7333018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7333557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7334006Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl3wsvtfc 2022-11-23T02:27:02.7334557Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl3wsvtfc/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7335095Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3pchipzr 2022-11-23T02:27:02.7335636Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3pchipzr/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7336050Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7336380Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7336791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:02.7337247Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 0 2022-11-23T02:27:02.7337743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:02.7338413Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7338958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:02.7339593Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7339991Z ok (5.389s) 2022-11-23T02:27:02.7340144Z 2022-11-23T02:27:02.7340419Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7340738Z Ran 1 test in 5.389s 2022-11-23T02:27:02.7340902Z 2022-11-23T02:27:02.7341085Z OK 2022-11-23T02:27:02.7341137Z 2022-11-23T02:27:02.7341267Z Generating XML reports... 2022-11-23T02:27:02.7341893Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022608.xml 2022-11-23T02:27:02.7342647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7343103Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7343690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7344400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7344837Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1ewww6vn 2022-11-23T02:27:02.7345440Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1ewww6vn/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7345877Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7346058Z 2022-11-23T02:27:02.7346169Z Running 
tests... 2022-11-23T02:27:02.7346591Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7347284Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7347860Z test_allreduce (__main__.TestDistributedNNFunctionsGloo) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61072 2022-11-23T02:27:02.7348408Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61073 2022-11-23T02:27:02.7349024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7349481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7350037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7351228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7351734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7352256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7352838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7353308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7353786Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpps64ghdq 2022-11-23T02:27:02.7354314Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpps64ghdq/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7354848Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9ewfvkpm 2022-11-23T02:27:02.7355387Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9ewfvkpm/_remote_module_non_scriptable.py 
2022-11-23T02:27:02.7355823Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7356128Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7356548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:02.7357029Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:02.7357504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:02.7358005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:02.7358674Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7359368Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7359747Z ok (5.588s) 2022-11-23T02:27:02.7359896Z 2022-11-23T02:27:02.7360169Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7360504Z Ran 1 test in 5.588s 2022-11-23T02:27:02.7360675Z 2022-11-23T02:27:02.7360748Z OK 2022-11-23T02:27:02.7360885Z 2022-11-23T02:27:02.7361018Z Generating XML reports... 
2022-11-23T02:27:02.7361666Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022617.xml 2022-11-23T02:27:02.7362412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7362848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7363426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7363904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7364355Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpti17ahnb 2022-11-23T02:27:02.7364903Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpti17ahnb/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7365341Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7365624Z 2022-11-23T02:27:02.7365725Z Running tests... 2022-11-23T02:27:02.7366121Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7366666Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7367249Z test_broadcast (__main__.TestDistributedNNFunctionsGloo) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61288 2022-11-23T02:27:02.7367774Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61289 2022-11-23T02:27:02.7368394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7368855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7369433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7369962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7370548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7371001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7371582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7372031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7372507Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa011l5zv 2022-11-23T02:27:02.7373052Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa011l5zv/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7373570Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpudpwvwiw 2022-11-23T02:27:02.7374125Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpudpwvwiw/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7374563Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7374980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:02.7375357Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7375767Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 1 2022-11-23T02:27:02.7376265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:02.7376743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:02.7377412Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7378106Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7378591Z ok (5.486s) 2022-11-23T02:27:02.7378726Z 2022-11-23T02:27:02.7379009Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7379351Z Ran 1 test in 5.486s 2022-11-23T02:27:02.7379519Z 2022-11-23T02:27:02.7379616Z OK 2022-11-23T02:27:02.7379751Z 2022-11-23T02:27:02.7379857Z Generating XML reports... 2022-11-23T02:27:02.7380507Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022627.xml 2022-11-23T02:27:02.7381253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7381712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7382275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7382751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7383315Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphzvdr9h5 2022-11-23T02:27:02.7384046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphzvdr9h5/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7384640Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7384868Z 2022-11-23T02:27:02.7384975Z Running 
tests... 2022-11-23T02:27:02.7385406Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7385912Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7386424Z test_gather (__main__.TestDistributedNNFunctionsGloo) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61504 2022-11-23T02:27:02.7386965Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61505 2022-11-23T02:27:02.7387556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7388108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7388694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7389166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7389728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7390180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7390759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7391226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7391669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp04mnxzip 2022-11-23T02:27:02.7392213Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp04mnxzip/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7392746Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa0fp3uyt 2022-11-23T02:27:02.7393263Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa0fp3uyt/_remote_module_non_scriptable.py 
2022-11-23T02:27:02.7393690Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7394095Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:02.7394483Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7394870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:02.7395368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:02.7395869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:02.7396580Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7397221Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7397623Z ok (5.587s) 2022-11-23T02:27:02.7397771Z 2022-11-23T02:27:02.7398041Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7398358Z Ran 1 test in 5.588s 2022-11-23T02:27:02.7398520Z 2022-11-23T02:27:02.7398617Z OK 2022-11-23T02:27:02.7398758Z 2022-11-23T02:27:02.7398886Z Generating XML reports... 
2022-11-23T02:27:02.7399510Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022636.xml 2022-11-23T02:27:02.7400254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7400705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7401365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7401834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7402307Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphufkx_4s 2022-11-23T02:27:02.7402857Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphufkx_4s/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7403294Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7403472Z 2022-11-23T02:27:02.7403582Z Running tests... 2022-11-23T02:27:02.7403992Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7404545Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7405100Z test_reduce (__main__.TestDistributedNNFunctionsGloo) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61720 2022-11-23T02:27:02.7405730Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61721 2022-11-23T02:27:02.7406346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7406800Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7407357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7407832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7408416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7408848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7409423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7409965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7410378Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjnco8f0p 2022-11-23T02:27:02.7410905Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjnco8f0p/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7411445Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsa_xqd37 2022-11-23T02:27:02.7411987Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsa_xqd37/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7412394Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7412726Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7413135Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:02.7413610Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 1 2022-11-23T02:27:02.7414081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:02.7414595Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:02.7415263Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7415937Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7416338Z ok (5.489s) 2022-11-23T02:27:02.7416488Z 2022-11-23T02:27:02.7416761Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7417098Z Ran 1 test in 5.489s 2022-11-23T02:27:02.7417240Z 2022-11-23T02:27:02.7417337Z OK 2022-11-23T02:27:02.7417474Z 2022-11-23T02:27:02.7417601Z Generating XML reports... 2022-11-23T02:27:02.7418251Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022646.xml 2022-11-23T02:27:02.7419043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7419512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7420096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7420566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7421012Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps_f6sref 2022-11-23T02:27:02.7421557Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps_f6sref/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7421988Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7422189Z 2022-11-23T02:27:02.7422302Z Running 
tests... 2022-11-23T02:27:02.7422700Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7423316Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:27:02.7424083Z test_scatter (__main__.TestDistributedNNFunctionsGloo) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61936 2022-11-23T02:27:02.7424626Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61937 2022-11-23T02:27:02.7425313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7425709Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7426283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7426733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7427313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:02.7427765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:02.7428374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:02.7428849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:02.7429319Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7km4ufdl 2022-11-23T02:27:02.7429861Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7km4ufdl/_remote_module_non_scriptable.py 2022-11-23T02:27:02.7430376Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzfyr5qy4 2022-11-23T02:27:02.7430916Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzfyr5qy4/_remote_module_non_scriptable.py 
2022-11-23T02:27:02.7431355Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7431747Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:02.7432152Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:02.7432563Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:02.7433056Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:02.7433538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:02.7434308Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7434899Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:02.7435294Z ok (5.688s) 2022-11-23T02:27:02.7435423Z 2022-11-23T02:27:02.7435696Z ---------------------------------------------------------------------- 2022-11-23T02:27:02.7436032Z Ran 1 test in 5.688s 2022-11-23T02:27:02.7436195Z 2022-11-23T02:27:02.7436297Z OK 2022-11-23T02:27:02.7436437Z 2022-11-23T02:27:02.7436544Z Generating XML reports... 
2022-11-23T02:27:02.7437279Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022656.xml 2022-11-23T02:27:02.7437683Z 2022-11-23T02:27:02.7438300Z ##[endgroup] 2022-11-23T02:27:02.7438881Z FINISHED PRINTING LOG FILE of distributed/test_c10d_spawn_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_gloo_okbk7ytk) 2022-11-23T02:27:02.7439229Z 2022-11-23T02:27:03.0888084Z 2022-11-23T02:27:03.0889081Z real 1m40.912s 2022-11-23T02:27:03.0889510Z user 2m32.590s 2022-11-23T02:27:03.0889756Z sys 2m4.543s 2022-11-23T02:27:03.0890313Z + python test/run_test.py --verbose -i distributed/test_c10d_spawn_nccl 2022-11-23T02:27:05.5002488Z Ignoring disabled issues: [] 2022-11-23T02:27:05.5527779Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:27:05.5528871Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:27:05.5529223Z Selected tests: 2022-11-23T02:27:05.5529547Z distributed/test_c10d_spawn_nccl 2022-11-23T02:27:05.5554456Z Prioritized test from test file changes. 
2022-11-23T02:27:05.5554973Z reordering tests for PR: 2022-11-23T02:27:05.5555251Z prioritized: [] 2022-11-23T02:27:05.5555750Z the rest: ['distributed/test_c10d_spawn_nccl'] 2022-11-23T02:27:05.5555963Z 2022-11-23T02:27:05.5556492Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:27:05.5557429Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:27:05.5564324Z parallel (file granularity) tests: 2022-11-23T02:27:05.5564788Z 2022-11-23T02:27:05.5565110Z serial (file granularity) tests: 2022-11-23T02:27:05.5565406Z distributed/test_c10d_spawn_nccl 2022-11-23T02:27:07.8495230Z Ignoring disabled issues: [] 2022-11-23T02:27:08.2619269Z Running distributed/test_c10d_spawn_nccl ... [2022-11-23 02:27:08.261271] 2022-11-23T02:27:08.2620697Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_spawn_nccl.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:27:08.261722] 2022-11-23T02:28:39.8198804Z 2022-11-23T02:28:39.8199883Z Expand the folded group to see the log file of distributed/test_c10d_spawn_nccl 2022-11-23T02:28:39.8200913Z ##[group]PRINTING LOG FILE of distributed/test_c10d_spawn_nccl (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_nccl_umefm4hr) 2022-11-23T02:28:39.8203152Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxx8d6hof 2022-11-23T02:28:39.8203750Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxx8d6hof/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8204235Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8204595Z 2022-11-23T02:28:39.8205061Z 2022-11-23T02:28:39.8206398Z , <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_gather_base>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_to_all>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_to_all_single>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_allreduce>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_broadcast>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce_scatter>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce_scatter_non_contiguous>]> 2022-11-23T02:28:39.8207747Z test_all_gather (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8208160Z test_all_gather_base (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8208826Z test_all_to_all (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8209331Z test_all_to_all_single (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8209752Z test_allreduce (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8210174Z test_broadcast (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8210590Z test_reduce (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8210999Z 
test_reduce_scatter (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8211428Z test_reduce_scatter_non_contiguous (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:28:39.8212249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8212739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8213338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8213965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8214507Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp874vyh_6 2022-11-23T02:28:39.8215088Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp874vyh_6/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8215551Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8215718Z 2022-11-23T02:28:39.8215748Z Running tests... 2022-11-23T02:28:39.8216303Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8216888Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8217500Z test_all_gather (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62363 2022-11-23T02:28:39.8218095Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62364 2022-11-23T02:28:39.8218774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8219274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8219975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8220503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8221111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8221609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8222179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8222674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8223159Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3yi6bbmy 2022-11-23T02:28:39.8224215Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3yi6bbmy/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8224745Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt1ilvars 2022-11-23T02:28:39.8225309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt1ilvars/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8225746Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8226068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8226475Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8226901Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:28:39.8227375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8227981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:28:39.8228716Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8229388Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8229769Z ok (5.588s) 2022-11-23T02:28:39.8229927Z 2022-11-23T02:28:39.8230204Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8230546Z Ran 1 test in 5.589s 2022-11-23T02:28:39.8230712Z 2022-11-23T02:28:39.8230811Z OK 2022-11-23T02:28:39.8230927Z 2022-11-23T02:28:39.8231057Z Generating XML reports... 2022-11-23T02:28:39.8231714Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022715.xml 2022-11-23T02:28:39.8232461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8233000Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8233599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8234078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8234558Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqzkqezol 2022-11-23T02:28:39.8235084Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqzkqezol/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8235520Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8235721Z 2022-11-23T02:28:39.8235833Z 
Running tests... 2022-11-23T02:28:39.8236227Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8236782Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8237390Z test_all_gather_base (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62582 2022-11-23T02:28:39.8237958Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62583 2022-11-23T02:28:39.8238556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8239023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8239605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8240056Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8240642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8241101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8241690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8242147Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8242620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu3ks_9sj 2022-11-23T02:28:39.8243169Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu3ks_9sj/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8243704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfsh97wa_ 2022-11-23T02:28:39.8244225Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfsh97wa_/_remote_module_non_scriptable.py 
2022-11-23T02:28:39.8244656Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8244983Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8245374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8246041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:28:39.8246563Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8247063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:28:39.8247712Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8248409Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8249413Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:28:39.8249897Z warnings.warn( 2022-11-23T02:28:39.8250710Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:28:39.8251266Z warnings.warn( 2022-11-23T02:28:39.8252047Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2923: UserWarning: torch.distributed._reduce_scatter_base is a private function and will be deprecated. Please use torch.distributed.reduce_scatter_tensor instead. 
2022-11-23T02:28:39.8252610Z warnings.warn( 2022-11-23T02:28:39.8253352Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2923: UserWarning: torch.distributed._reduce_scatter_base is a private function and will be deprecated. Please use torch.distributed.reduce_scatter_tensor instead. 2022-11-23T02:28:39.8253900Z warnings.warn( 2022-11-23T02:28:39.8254140Z ok (5.689s) 2022-11-23T02:28:39.8254292Z 2022-11-23T02:28:39.8254571Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8254887Z Ran 1 test in 5.690s 2022-11-23T02:28:39.8255055Z 2022-11-23T02:28:39.8255155Z OK 2022-11-23T02:28:39.8255298Z 2022-11-23T02:28:39.8255498Z Generating XML reports... 2022-11-23T02:28:39.8256056Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022725.xml 2022-11-23T02:28:39.8256809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8257277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8257869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8258325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8258799Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3x7nht6a 2022-11-23T02:28:39.8259355Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3x7nht6a/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8259767Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8259967Z 2022-11-23T02:28:39.8260077Z Running tests... 
2022-11-23T02:28:39.8260490Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8261050Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8261607Z test_all_to_all (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62801 2022-11-23T02:28:39.8262160Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62802 2022-11-23T02:28:39.8262781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8263214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8263911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8264843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8265438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8265854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8266490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8266953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8267333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk6e7luv5 2022-11-23T02:28:39.8267865Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk6e7luv5/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8268508Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_bz5n6ta 2022-11-23T02:28:39.8269057Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_bz5n6ta/_remote_module_non_scriptable.py 
2022-11-23T02:28:39.8269465Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8269849Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8270270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8270770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:28:39.8271242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8271733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:28:39.8272403Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8273086Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8273493Z ok (5.587s) 2022-11-23T02:28:39.8273650Z 2022-11-23T02:28:39.8273925Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8274271Z Ran 1 test in 5.588s 2022-11-23T02:28:39.8274413Z 2022-11-23T02:28:39.8274512Z OK 2022-11-23T02:28:39.8274653Z 2022-11-23T02:28:39.8274781Z Generating XML reports... 
2022-11-23T02:28:39.8275433Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022735.xml 2022-11-23T02:28:39.8276161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8276620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8277209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8277690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8278144Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx_kq6ps6 2022-11-23T02:28:39.8278687Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx_kq6ps6/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8279120Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8279319Z 2022-11-23T02:28:39.8279408Z Running tests... 2022-11-23T02:28:39.8279822Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8280372Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8280965Z test_all_to_all_single (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63022 2022-11-23T02:28:39.8281505Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63023 2022-11-23T02:28:39.8282227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8282705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8283359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8283757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8284351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8284806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8285363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8285836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8286385Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu6o9egr6 2022-11-23T02:28:39.8286935Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu6o9egr6/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8287450Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu0l1il3y 2022-11-23T02:28:39.8287993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu0l1il3y/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8288434Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8288828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8289324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:28:39.8289732Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8290141Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8290617Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:28:39.8291391Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8292101Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8292572Z ok (5.587s) 2022-11-23T02:28:39.8292628Z 2022-11-23T02:28:39.8292904Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8293240Z Ran 1 test in 5.588s 2022-11-23T02:28:39.8293407Z 2022-11-23T02:28:39.8293503Z OK 2022-11-23T02:28:39.8293617Z 2022-11-23T02:28:39.8293748Z Generating XML reports... 2022-11-23T02:28:39.8294396Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022744.xml 2022-11-23T02:28:39.8295136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8295582Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8296168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8296650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8297126Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqcg500t6 2022-11-23T02:28:39.8297648Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqcg500t6/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8298085Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8298285Z 
2022-11-23T02:28:39.8298394Z Running tests... 2022-11-23T02:28:39.8298882Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8299336Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8299977Z test_allreduce (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63243 2022-11-23T02:28:39.8300549Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63244 2022-11-23T02:28:39.8301146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8301612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8302191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8302671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8303229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8303692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8304847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8305301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8305711Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvmapkzn6 2022-11-23T02:28:39.8306234Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvmapkzn6/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8306779Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc2_edjfv 2022-11-23T02:28:39.8307295Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpc2_edjfv/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8307726Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8308142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8308614Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:28:39.8309030Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8309442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8309941Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:28:39.8310587Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8311288Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8311683Z ok (5.588s) 2022-11-23T02:28:39.8311837Z 2022-11-23T02:28:39.8312111Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8312498Z Ran 1 test in 5.589s 2022-11-23T02:28:39.8312590Z 2022-11-23T02:28:39.8312686Z OK 2022-11-23T02:28:39.8312826Z 2022-11-23T02:28:39.8312951Z Generating XML reports... 
2022-11-23T02:28:39.8313586Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022754.xml 2022-11-23T02:28:39.8314340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8314808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8315395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8315852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8316329Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpupfpf1pr 2022-11-23T02:28:39.8316877Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpupfpf1pr/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8317289Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8317544Z 2022-11-23T02:28:39.8317604Z Running tests... 2022-11-23T02:28:39.8318022Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8318664Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8319242Z test_broadcast (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63462 2022-11-23T02:28:39.8319882Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63463 2022-11-23T02:28:39.8320524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8320961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8321550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8322031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8322735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8323170Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8323755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8324227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8324682Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzvvhuhpu 2022-11-23T02:28:39.8325233Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzvvhuhpu/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8325775Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpofkq8q_a 2022-11-23T02:28:39.8326314Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpofkq8q_a/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8326724Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8327062Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8327481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8327960Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:28:39.8328454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8328945Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:28:39.8329622Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8330300Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8330699Z ok (5.686s) 2022-11-23T02:28:39.8330850Z 2022-11-23T02:28:39.8331125Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8331469Z Ran 1 test in 5.687s 2022-11-23T02:28:39.8331614Z 2022-11-23T02:28:39.8331715Z OK 2022-11-23T02:28:39.8331852Z 2022-11-23T02:28:39.8331980Z Generating XML reports... 2022-11-23T02:28:39.8332634Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022804.xml 2022-11-23T02:28:39.8333358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8333821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8334402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8334876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8335330Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9ntol1a4 2022-11-23T02:28:39.8335880Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9ntol1a4/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8336374Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8336590Z 2022-11-23T02:28:39.8336678Z 
Running tests... 2022-11-23T02:28:39.8337099Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8337653Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8338319Z test_reduce (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63681 2022-11-23T02:28:39.8338751Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63682 2022-11-23T02:28:39.8339368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8339831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8340398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8340978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8341610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8342110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8342705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8343212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8343708Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfmgrw4ln 2022-11-23T02:28:39.8344601Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfmgrw4ln/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8345105Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6btm7wo8 2022-11-23T02:28:39.8345684Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6btm7wo8/_remote_module_non_scriptable.py 
2022-11-23T02:28:39.8346121Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8346506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8346952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:28:39.8347364Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8347774Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8348243Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:28:39.8348921Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8349620Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8350005Z ok (5.586s) 2022-11-23T02:28:39.8350156Z 2022-11-23T02:28:39.8350501Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8350835Z Ran 1 test in 5.587s 2022-11-23T02:28:39.8350942Z 2022-11-23T02:28:39.8351045Z OK 2022-11-23T02:28:39.8351159Z 2022-11-23T02:28:39.8351289Z Generating XML reports... 
2022-11-23T02:28:39.8351937Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022813.xml 2022-11-23T02:28:39.8352683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8353117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8353703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8354186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8354744Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm62q9rgy 2022-11-23T02:28:39.8355291Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm62q9rgy/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8355728Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8355961Z 2022-11-23T02:28:39.8356050Z Running tests... 2022-11-23T02:28:39.8356446Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8356997Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8357585Z test_reduce_scatter (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63900 2022-11-23T02:28:39.8358135Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63901 2022-11-23T02:28:39.8358815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8359278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8359866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8360343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8360909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8361364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8361949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8362393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8362871Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp95my7g_1 2022-11-23T02:28:39.8363417Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp95my7g_1/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8363951Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvrx03l3p 2022-11-23T02:28:39.8364467Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvrx03l3p/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8364900Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8365313Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8365787Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T02:28:39.8366188Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8366598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8367089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:28:39.8367747Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8368453Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8368856Z ok (5.491s) 2022-11-23T02:28:39.8369011Z 2022-11-23T02:28:39.8369261Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8369598Z Ran 1 test in 5.492s 2022-11-23T02:28:39.8369762Z 2022-11-23T02:28:39.8369858Z OK 2022-11-23T02:28:39.8369996Z 2022-11-23T02:28:39.8370240Z Generating XML reports... 2022-11-23T02:28:39.8370873Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022823.xml 2022-11-23T02:28:39.8371619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8372077Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8372697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8373226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8373726Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphehqkik9 2022-11-23T02:28:39.8374309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphehqkik9/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8374745Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8374952Z 
2022-11-23T02:28:39.8375070Z Running tests... 2022-11-23T02:28:39.8375507Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8376071Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:28:39.8376721Z test_reduce_scatter_non_contiguous (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64119 2022-11-23T02:28:39.8377405Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64120 2022-11-23T02:28:39.8378067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8378538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8379161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8379667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8380288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:28:39.8380752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:28:39.8381359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:28:39.8381849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:28:39.8382302Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdn7_ck8u 2022-11-23T02:28:39.8382843Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdn7_ck8u/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8383388Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgfs9zo6c 2022-11-23T02:28:39.8384124Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpgfs9zo6c/_remote_module_non_scriptable.py 2022-11-23T02:28:39.8384628Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8385026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:28:39.8385531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:28:39.8385939Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:28:39.8386360Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:28:39.8386841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:28:39.8387431Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8388109Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:28:39.8388516Z ok (5.693s) 2022-11-23T02:28:39.8388668Z 2022-11-23T02:28:39.8388940Z ---------------------------------------------------------------------- 2022-11-23T02:28:39.8389259Z Ran 1 test in 5.694s 2022-11-23T02:28:39.8389431Z 2022-11-23T02:28:39.8389530Z OK 2022-11-23T02:28:39.8389670Z 2022-11-23T02:28:39.8389796Z Generating XML reports... 
2022-11-23T02:28:39.8390449Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022833.xml 2022-11-23T02:28:39.8390932Z 2022-11-23T02:28:39.8391371Z ##[endgroup] 2022-11-23T02:28:39.8392017Z FINISHED PRINTING LOG FILE of distributed/test_c10d_spawn_nccl (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_nccl_umefm4hr) 2022-11-23T02:28:39.8392385Z 2022-11-23T02:28:40.1841599Z 2022-11-23T02:28:40.1842256Z real 1m37.095s 2022-11-23T02:28:40.1842534Z user 2m43.370s 2022-11-23T02:28:40.1842833Z sys 2m5.095s 2022-11-23T02:28:40.1844337Z + python test/run_test.py --verbose -i distributed/test_store 2022-11-23T02:28:42.5896077Z Ignoring disabled issues: [] 2022-11-23T02:28:42.6423120Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:28:42.6423701Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:28:42.6424409Z Selected tests: 2022-11-23T02:28:42.6425022Z distributed/test_store 2022-11-23T02:28:42.6448939Z Prioritized test from test file changes. 
2022-11-23T02:28:42.6449252Z reordering tests for PR: 2022-11-23T02:28:42.6449519Z prioritized: [] 2022-11-23T02:28:42.6449983Z the rest: ['distributed/test_store'] 2022-11-23T02:28:42.6450170Z 2022-11-23T02:28:42.6450699Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:28:42.6451618Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:28:42.6457033Z parallel (file granularity) tests: 2022-11-23T02:28:42.6457321Z 2022-11-23T02:28:42.6457648Z serial (file granularity) tests: 2022-11-23T02:28:42.6458070Z distributed/test_store 2022-11-23T02:28:44.9406122Z Ignoring disabled issues: [] 2022-11-23T02:28:45.3471193Z Running distributed/test_store ... [2022-11-23 02:28:45.346628] 2022-11-23T02:28:45.3473761Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_store.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:28:45.347068] 2022-11-23T02:30:57.1529438Z 2022-11-23T02:30:57.1530011Z Expand the folded group to see the log file of distributed/test_store 2022-11-23T02:30:57.1533312Z ##[group]PRINTING LOG FILE of distributed/test_store (/var/lib/jenkins/workspace/test/test-reports/distributed-test_store_fgs2jmwy) 2022-11-23T02:30:57.1534112Z , <__main__.FileStoreTest testMethod=test_init_pg_and_rpc_with_same_file>, <__main__.FileStoreTest testMethod=test_refcount>, <__main__.FileStoreTest testMethod=test_set_get>]> 2022-11-23T02:30:57.1534716Z test_compare_set (__main__.FileStoreTest) 2022-11-23T02:30:57.1536839Z test_init_pg_and_rpc_with_same_file (__main__.FileStoreTest) 2022-11-23T02:30:57.1537229Z test_refcount (__main__.FileStoreTest) 2022-11-23T02:30:57.1537547Z test_set_get (__main__.FileStoreTest) 2022-11-23T02:30:57.1538012Z , <__main__.HashStoreTest testMethod=test_set_get>]> 2022-11-23T02:30:57.1538442Z test_compare_set (__main__.HashStoreTest) 2022-11-23T02:30:57.1538773Z test_set_get (__main__.HashStoreTest) 2022-11-23T02:30:57.1539263Z , <__main__.PrefixFileStoreTest testMethod=test_set_get>]> 2022-11-23T02:30:57.1539769Z test_compare_set (__main__.PrefixFileStoreTest) 2022-11-23T02:30:57.1540091Z test_set_get (__main__.PrefixFileStoreTest) 2022-11-23T02:30:57.1540525Z ]> 2022-11-23T02:30:57.1540952Z test_get_underlying_store (__main__.PrefixStoreTest) 2022-11-23T02:30:57.1541747Z , <__main__.PrefixTCPStoreTest testMethod=test_set_get>]> 2022-11-23T02:30:57.1542345Z test_compare_set (__main__.PrefixTCPStoreTest) 2022-11-23T02:30:57.1542683Z test_set_get (__main__.PrefixTCPStoreTest) 2022-11-23T02:30:57.1543087Z ]> 2022-11-23T02:30:57.1543463Z test_set_get (__main__.PythonStoreTest) 2022-11-23T02:30:57.1544604Z ]> 2022-11-23T02:30:57.1545064Z test_nominal (__main__.RendezvousEnvTest) 2022-11-23T02:30:57.1545543Z , <__main__.RendezvousFileTest testMethod=test_nominal>]> 2022-11-23T02:30:57.1546026Z test_common_errors (__main__.RendezvousFileTest) 
2022-11-23T02:30:57.1546360Z test_nominal (__main__.RendezvousFileTest) 2022-11-23T02:30:57.1547021Z , <__main__.RendezvousTCPTest testMethod=test_dns_timeout>, <__main__.RendezvousTCPTest testMethod=test_nominal>, <__main__.RendezvousTCPTest testMethod=test_tcp_store_timeout_set>]> 2022-11-23T02:30:57.1547886Z test_common_errors (__main__.RendezvousTCPTest) 2022-11-23T02:30:57.1548214Z test_dns_timeout (__main__.RendezvousTCPTest) 2022-11-23T02:30:57.1548544Z test_nominal (__main__.RendezvousTCPTest) 2022-11-23T02:30:57.1548886Z test_tcp_store_timeout_set (__main__.RendezvousTCPTest) 2022-11-23T02:30:57.1549409Z , <__main__.RendezvousTest testMethod=test_url_with_node_params>]> 2022-11-23T02:30:57.1549873Z test_unknown_handler (__main__.RendezvousTest) 2022-11-23T02:30:57.1550214Z test_url_with_node_params (__main__.RendezvousTest) 2022-11-23T02:30:57.1551181Z , <__main__.TCPStoreTest testMethod=test_compare_set>, <__main__.TCPStoreTest testMethod=test_init_pg_and_rpc_with_same_socket>, <__main__.TCPStoreTest testMethod=test_multi_worker_with_fixed_world_size>, <__main__.TCPStoreTest testMethod=test_multi_worker_with_nonfixed_world_size>, <__main__.TCPStoreTest testMethod=test_multitenancy>, <__main__.TCPStoreTest testMethod=test_numkeys_delkeys>, <__main__.TCPStoreTest testMethod=test_set_get>]> 2022-11-23T02:30:57.1552104Z test_address_already_in_use (__main__.TCPStoreTest) 2022-11-23T02:30:57.1552439Z test_compare_set (__main__.TCPStoreTest) 2022-11-23T02:30:57.1552773Z test_init_pg_and_rpc_with_same_socket (__main__.TCPStoreTest) 2022-11-23T02:30:57.1553147Z test_multi_worker_with_fixed_world_size (__main__.TCPStoreTest) 2022-11-23T02:30:57.1553541Z test_multi_worker_with_nonfixed_world_size (__main__.TCPStoreTest) 2022-11-23T02:30:57.1553896Z test_multitenancy (__main__.TCPStoreTest) 2022-11-23T02:30:57.1554201Z test_numkeys_delkeys (__main__.TCPStoreTest) 2022-11-23T02:30:57.1554520Z test_set_get (__main__.TCPStoreTest) 2022-11-23T02:30:57.1555219Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1555661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1556241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1556709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1556942Z 2022-11-23T02:30:57.1557052Z Running tests... 2022-11-23T02:30:57.1557448Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1557971Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1558439Z test_compare_set (__main__.FileStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1558762Z ok (1.760s) 2022-11-23T02:30:57.1558915Z 2022-11-23T02:30:57.1559183Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1559603Z Ran 1 test in 1.761s 2022-11-23T02:30:57.1559777Z 2022-11-23T02:30:57.1559852Z OK 2022-11-23T02:30:57.1559985Z 2022-11-23T02:30:57.1560109Z Generating XML reports... 2022-11-23T02:30:57.1560668Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-FileStoreTest-20221123022849.xml 2022-11-23T02:30:57.1561344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1561780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1562355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1562826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1563058Z 2022-11-23T02:30:57.1563149Z Running tests... 
2022-11-23T02:30:57.1563618Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1564144Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1564636Z test_init_pg_and_rpc_with_same_file (__main__.FileStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1565331Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:57.1565997Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:57.1566393Z ok (1.833s) 2022-11-23T02:30:57.1566541Z 2022-11-23T02:30:57.1566807Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1567114Z Ran 1 test in 1.833s 2022-11-23T02:30:57.1567276Z 2022-11-23T02:30:57.1567368Z OK 2022-11-23T02:30:57.1567500Z 2022-11-23T02:30:57.1567624Z Generating XML reports... 2022-11-23T02:30:57.1568157Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-FileStoreTest-20221123022853.xml 2022-11-23T02:30:57.1568836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1569288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1569864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1570322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1570550Z 2022-11-23T02:30:57.1570659Z Running tests... 2022-11-23T02:30:57.1571062Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1571565Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1572023Z test_refcount (__main__.FileStoreTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1572352Z ok (1.734s) 2022-11-23T02:30:57.1572504Z 2022-11-23T02:30:57.1572773Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1573084Z Ran 1 test in 1.734s 2022-11-23T02:30:57.1573246Z 2022-11-23T02:30:57.1573337Z OK 2022-11-23T02:30:57.1573469Z 2022-11-23T02:30:57.1573590Z Generating XML reports... 2022-11-23T02:30:57.1574124Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-FileStoreTest-20221123022857.xml 2022-11-23T02:30:57.1574796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1575247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1575800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1576273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1576506Z 2022-11-23T02:30:57.1576613Z Running tests... 2022-11-23T02:30:57.1577029Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1577591Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1578054Z test_set_get (__main__.FileStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1578384Z ok (1.757s) 2022-11-23T02:30:57.1578532Z 2022-11-23T02:30:57.1578780Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1579118Z Ran 1 test in 1.757s 2022-11-23T02:30:57.1579280Z 2022-11-23T02:30:57.1579371Z OK 2022-11-23T02:30:57.1579505Z 2022-11-23T02:30:57.1579628Z Generating XML reports... 
2022-11-23T02:30:57.1580166Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-FileStoreTest-20221123022901.xml 2022-11-23T02:30:57.1580834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1581285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1581911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1582382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1582609Z 2022-11-23T02:30:57.1582717Z Running tests... 2022-11-23T02:30:57.1583121Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1583628Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1584604Z test_compare_set (__main__.HashStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1584944Z ok (1.704s) 2022-11-23T02:30:57.1585096Z 2022-11-23T02:30:57.1585357Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1585690Z Ran 1 test in 1.704s 2022-11-23T02:30:57.1585852Z 2022-11-23T02:30:57.1585945Z OK 2022-11-23T02:30:57.1586086Z 2022-11-23T02:30:57.1586216Z Generating XML reports... 
2022-11-23T02:30:57.1586761Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-HashStoreTest-20221123022905.xml 2022-11-23T02:30:57.1587438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1587888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1588441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1588911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1589138Z 2022-11-23T02:30:57.1589248Z Running tests... 2022-11-23T02:30:57.1589653Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1590151Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1590607Z test_set_get (__main__.HashStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1590949Z ok (1.717s) 2022-11-23T02:30:57.1591109Z 2022-11-23T02:30:57.1591360Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1591698Z Ran 1 test in 1.717s 2022-11-23T02:30:57.1591861Z 2022-11-23T02:30:57.1591954Z OK 2022-11-23T02:30:57.1592088Z 2022-11-23T02:30:57.1592210Z Generating XML reports... 
2022-11-23T02:30:57.1592742Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-HashStoreTest-20221123022909.xml 2022-11-23T02:30:57.1593415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1593870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1594425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1594893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1595128Z 2022-11-23T02:30:57.1595239Z Running tests... 2022-11-23T02:30:57.1595733Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1596250Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1596727Z test_compare_set (__main__.PrefixFileStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1597079Z ok (1.720s) 2022-11-23T02:30:57.1597231Z 2022-11-23T02:30:57.1597478Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1597815Z Ran 1 test in 1.721s 2022-11-23T02:30:57.1597976Z 2022-11-23T02:30:57.1598070Z OK 2022-11-23T02:30:57.1598201Z 2022-11-23T02:30:57.1598307Z Generating XML reports... 
2022-11-23T02:30:57.1598881Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-PrefixFileStoreTest-20221123022914.xml 2022-11-23T02:30:57.1599569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1600108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1600670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1601145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1601380Z 2022-11-23T02:30:57.1601497Z Running tests... 2022-11-23T02:30:57.1601907Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1602411Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1602883Z test_set_get (__main__.PrefixFileStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1603226Z ok (1.722s) 2022-11-23T02:30:57.1603373Z 2022-11-23T02:30:57.1603622Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1603949Z Ran 1 test in 1.722s 2022-11-23T02:30:57.1604120Z 2022-11-23T02:30:57.1604216Z OK 2022-11-23T02:30:57.1604350Z 2022-11-23T02:30:57.1604458Z Generating XML reports... 
2022-11-23T02:30:57.1605034Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-PrefixFileStoreTest-20221123022918.xml 2022-11-23T02:30:57.1605726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1606182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1606739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1607214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1607441Z 2022-11-23T02:30:57.1607549Z Running tests... 2022-11-23T02:30:57.1607933Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1608453Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1608938Z test_get_underlying_store (__main__.PrefixStoreTest) ... ok (0.003s) 2022-11-23T02:30:57.1609165Z 2022-11-23T02:30:57.1609428Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1609733Z Ran 1 test in 0.003s 2022-11-23T02:30:57.1609893Z 2022-11-23T02:30:57.1609984Z OK 2022-11-23T02:30:57.1610120Z 2022-11-23T02:30:57.1610244Z Generating XML reports... 
2022-11-23T02:30:57.1610873Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-PrefixStoreTest-20221123022922.xml 2022-11-23T02:30:57.1611570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1612025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1612609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1613066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1613363Z 2022-11-23T02:30:57.1613480Z Running tests... 2022-11-23T02:30:57.1613887Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1614411Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1614870Z test_compare_set (__main__.PrefixTCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1615222Z ok (1.741s) 2022-11-23T02:30:57.1615369Z 2022-11-23T02:30:57.1615633Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1615940Z Ran 1 test in 1.741s 2022-11-23T02:30:57.1616099Z 2022-11-23T02:30:57.1616192Z OK 2022-11-23T02:30:57.1616327Z 2022-11-23T02:30:57.1616451Z Generating XML reports... 
2022-11-23T02:30:57.1617001Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-PrefixTCPStoreTest-20221123022924.xml 2022-11-23T02:30:57.1617748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1618199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1618772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1619218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1619448Z 2022-11-23T02:30:57.1619555Z Running tests... 2022-11-23T02:30:57.1619955Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1620460Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1620931Z test_set_get (__main__.PrefixTCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1621279Z ok (1.715s) 2022-11-23T02:30:57.1621426Z 2022-11-23T02:30:57.1621687Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1622003Z Ran 1 test in 1.716s 2022-11-23T02:30:57.1622165Z 2022-11-23T02:30:57.1622262Z OK 2022-11-23T02:30:57.1622397Z 2022-11-23T02:30:57.1622518Z Generating XML reports... 
2022-11-23T02:30:57.1623065Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-PrefixTCPStoreTest-20221123022928.xml 2022-11-23T02:30:57.1623754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1624445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1625025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1625477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1625706Z 2022-11-23T02:30:57.1625814Z Running tests... 2022-11-23T02:30:57.1626219Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1626731Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1627192Z test_set_get (__main__.PythonStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1627527Z ok (1.729s) 2022-11-23T02:30:57.1627676Z 2022-11-23T02:30:57.1627947Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1628257Z Ran 1 test in 1.729s 2022-11-23T02:30:57.1628418Z 2022-11-23T02:30:57.1628509Z OK 2022-11-23T02:30:57.1628641Z 2022-11-23T02:30:57.1628765Z Generating XML reports... 
2022-11-23T02:30:57.1629305Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-PythonStoreTest-20221123022932.xml 2022-11-23T02:30:57.1629981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1630433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1631001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1631533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1631778Z 2022-11-23T02:30:57.1631889Z Running tests... 2022-11-23T02:30:57.1632307Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1632814Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1633285Z test_nominal (__main__.RendezvousEnvTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1633634Z ok (1.711s) 2022-11-23T02:30:57.1633781Z 2022-11-23T02:30:57.1634049Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1634360Z Ran 1 test in 1.712s 2022-11-23T02:30:57.1634521Z 2022-11-23T02:30:57.1634616Z OK 2022-11-23T02:30:57.1634747Z 2022-11-23T02:30:57.1634871Z Generating XML reports... 
2022-11-23T02:30:57.1635415Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousEnvTest-20221123022936.xml 2022-11-23T02:30:57.1636195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1636644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1637219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1637676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1637911Z 2022-11-23T02:30:57.1638025Z Running tests... 2022-11-23T02:30:57.1638429Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1638935Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1639408Z test_common_errors (__main__.RendezvousFileTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1639764Z ok (1.720s) 2022-11-23T02:30:57.1639911Z 2022-11-23T02:30:57.1640181Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1640493Z Ran 1 test in 1.720s 2022-11-23T02:30:57.1640656Z 2022-11-23T02:30:57.1640748Z OK 2022-11-23T02:30:57.1640882Z 2022-11-23T02:30:57.1641007Z Generating XML reports... 
2022-11-23T02:30:57.1641554Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousFileTest-20221123022940.xml 2022-11-23T02:30:57.1642243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1642687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1643257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1643712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1643943Z 2022-11-23T02:30:57.1644056Z Running tests... 2022-11-23T02:30:57.1644464Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1644973Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1645439Z test_nominal (__main__.RendezvousFileTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1645780Z ok (1.715s) 2022-11-23T02:30:57.1645926Z 2022-11-23T02:30:57.1646189Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1646499Z Ran 1 test in 1.715s 2022-11-23T02:30:57.1646661Z 2022-11-23T02:30:57.1646752Z OK 2022-11-23T02:30:57.1646885Z 2022-11-23T02:30:57.1647008Z Generating XML reports... 
2022-11-23T02:30:57.1647555Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousFileTest-20221123022945.xml 2022-11-23T02:30:57.1648239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1648690Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1649322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1649784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1650014Z 2022-11-23T02:30:57.1650121Z Running tests... 2022-11-23T02:30:57.1650527Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1651030Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1651498Z test_common_errors (__main__.RendezvousTCPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1651856Z ok (1.712s) 2022-11-23T02:30:57.1652009Z 2022-11-23T02:30:57.1652254Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1652590Z Ran 1 test in 1.713s 2022-11-23T02:30:57.1652751Z 2022-11-23T02:30:57.1652844Z OK 2022-11-23T02:30:57.1653032Z 2022-11-23T02:30:57.1653155Z Generating XML reports... 
2022-11-23T02:30:57.1653709Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousTCPTest-20221123022949.xml 2022-11-23T02:30:57.1654391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1654841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1655391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1655863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1656090Z 2022-11-23T02:30:57.1656209Z Running tests... 2022-11-23T02:30:57.1656608Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1657111Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1657580Z test_dns_timeout (__main__.RendezvousTCPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1658234Z [W socket.cpp:601] [c10d] The IPv6 network addresses of (dnsnotexist, 23456) cannot be retrieved (gai error: -2 - Name or service not known). 2022-11-23T02:30:57.1658776Z [E socket.cpp:860] [c10d] The client socket has timed out after 1s while trying to connect to (dnsnotexist, 23456). 2022-11-23T02:30:57.1659110Z ok (1.722s) 2022-11-23T02:30:57.1659260Z 2022-11-23T02:30:57.1659529Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1659857Z Ran 1 test in 1.722s 2022-11-23T02:30:57.1660018Z 2022-11-23T02:30:57.1660091Z OK 2022-11-23T02:30:57.1660221Z 2022-11-23T02:30:57.1660345Z Generating XML reports... 
2022-11-23T02:30:57.1660920Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousTCPTest-20221123022953.xml 2022-11-23T02:30:57.1661612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1662054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1662624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1663093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1663326Z 2022-11-23T02:30:57.1663415Z Running tests... 2022-11-23T02:30:57.1663817Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1664546Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1665015Z test_nominal (__main__.RendezvousTCPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1665336Z ok (1.719s) 2022-11-23T02:30:57.1665487Z 2022-11-23T02:30:57.1665755Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1666093Z Ran 1 test in 1.719s 2022-11-23T02:30:57.1666263Z 2022-11-23T02:30:57.1666339Z OK 2022-11-23T02:30:57.1666479Z 2022-11-23T02:30:57.1666683Z Generating XML reports... 
2022-11-23T02:30:57.1667275Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousTCPTest-20221123022957.xml 2022-11-23T02:30:57.1667944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1668411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1669002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1669581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1669818Z 2022-11-23T02:30:57.1669909Z Running tests... 2022-11-23T02:30:57.1670325Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1670860Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1671439Z test_tcp_store_timeout_set (__main__.RendezvousTCPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1671788Z ok (11.855s) 2022-11-23T02:30:57.1671943Z 2022-11-23T02:30:57.1672214Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1672541Z Ran 1 test in 11.855s 2022-11-23T02:30:57.1672684Z 2022-11-23T02:30:57.1672779Z OK 2022-11-23T02:30:57.1672911Z 2022-11-23T02:30:57.1673034Z Generating XML reports... 
2022-11-23T02:30:57.1673595Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousTCPTest-20221123023001.xml 2022-11-23T02:30:57.1674260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1674709Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1675282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1675760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1675993Z 2022-11-23T02:30:57.1676083Z Running tests... 2022-11-23T02:30:57.1676484Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1677011Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1677461Z test_unknown_handler (__main__.RendezvousTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1677804Z ok (1.720s) 2022-11-23T02:30:57.1677950Z 2022-11-23T02:30:57.1678213Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1678537Z Ran 1 test in 1.721s 2022-11-23T02:30:57.1678678Z 2022-11-23T02:30:57.1678771Z OK 2022-11-23T02:30:57.1678902Z 2022-11-23T02:30:57.1679026Z Generating XML reports... 
2022-11-23T02:30:57.1679576Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousTest-20221123023015.xml 2022-11-23T02:30:57.1680239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1680689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1681262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1681729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1681957Z 2022-11-23T02:30:57.1682048Z Running tests... 2022-11-23T02:30:57.1682451Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1682976Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1683434Z test_url_with_node_params (__main__.RendezvousTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1683781Z ok (1.746s) 2022-11-23T02:30:57.1683927Z 2022-11-23T02:30:57.1684196Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1684578Z Ran 1 test in 1.746s 2022-11-23T02:30:57.1684731Z 2022-11-23T02:30:57.1684825Z OK 2022-11-23T02:30:57.1684957Z 2022-11-23T02:30:57.1685079Z Generating XML reports... 
2022-11-23T02:30:57.1685635Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-RendezvousTest-20221123023019.xml 2022-11-23T02:30:57.1686291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1686740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1687313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1687784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1687994Z 2022-11-23T02:30:57.1688104Z Running tests... 2022-11-23T02:30:57.1688503Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1689093Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1689551Z test_address_already_in_use (__main__.TCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1690175Z [W socket.cpp:426] [c10d] The server socket has failed to bind to [::]:36109 (errno: 98 - Address already in use). 2022-11-23T02:30:57.1690775Z [W socket.cpp:426] [c10d] The server socket has failed to bind to 0.0.0.0:36109 (errno: 98 - Address already in use). 2022-11-23T02:30:57.1691238Z [E socket.cpp:462] [c10d] The server socket has failed to listen on any local network address. 2022-11-23T02:30:57.1691551Z ok (1.730s) 2022-11-23T02:30:57.1691697Z 2022-11-23T02:30:57.1691964Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1692294Z Ran 1 test in 1.730s 2022-11-23T02:30:57.1692454Z 2022-11-23T02:30:57.1692527Z OK 2022-11-23T02:30:57.1692665Z 2022-11-23T02:30:57.1692789Z Generating XML reports... 
2022-11-23T02:30:57.1693336Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023024.xml 2022-11-23T02:30:57.1694003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1694434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1695006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1695472Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1695702Z 2022-11-23T02:30:57.1695811Z Running tests... 2022-11-23T02:30:57.1696201Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1696724Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1697182Z test_compare_set (__main__.TCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1697503Z ok (1.728s) 2022-11-23T02:30:57.1697654Z 2022-11-23T02:30:57.1697917Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1698244Z Ran 1 test in 1.728s 2022-11-23T02:30:57.1698407Z 2022-11-23T02:30:57.1698480Z OK 2022-11-23T02:30:57.1698664Z 2022-11-23T02:30:57.1698787Z Generating XML reports... 
2022-11-23T02:30:57.1699330Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023028.xml 2022-11-23T02:30:57.1699996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1700427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1700998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1701464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1701697Z 2022-11-23T02:30:57.1701787Z Running tests... 2022-11-23T02:30:57.1702245Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1702781Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1703271Z test_init_pg_and_rpc_with_same_socket (__main__.TCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1703758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:57.1704783Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:57.1705180Z ok (1.806s) 2022-11-23T02:30:57.1705327Z 2022-11-23T02:30:57.1705572Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1705905Z Ran 1 test in 1.807s 2022-11-23T02:30:57.1706072Z 2022-11-23T02:30:57.1706163Z OK 2022-11-23T02:30:57.1706387Z 2022-11-23T02:30:57.1706511Z Generating XML reports... 
2022-11-23T02:30:57.1707047Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023032.xml 2022-11-23T02:30:57.1707711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1708162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1708719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1709192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1709425Z 2022-11-23T02:30:57.1709534Z Running tests... 2022-11-23T02:30:57.1709939Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1710441Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1711016Z test_multi_worker_with_fixed_world_size (__main__.TCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1711387Z ok (1.741s) 2022-11-23T02:30:57.1711536Z 2022-11-23T02:30:57.1711789Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1712124Z Ran 1 test in 1.741s 2022-11-23T02:30:57.1712289Z 2022-11-23T02:30:57.1712381Z OK 2022-11-23T02:30:57.1712512Z 2022-11-23T02:30:57.1712636Z Generating XML reports... 
2022-11-23T02:30:57.1713168Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023036.xml 2022-11-23T02:30:57.1713834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1714284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1714837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1715312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1715546Z 2022-11-23T02:30:57.1715657Z Running tests... 2022-11-23T02:30:57.1716060Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1716564Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1717056Z test_multi_worker_with_nonfixed_world_size (__main__.TCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1717417Z ok (1.733s) 2022-11-23T02:30:57.1717564Z 2022-11-23T02:30:57.1717808Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1718138Z Ran 1 test in 1.733s 2022-11-23T02:30:57.1718298Z 2022-11-23T02:30:57.1718388Z OK 2022-11-23T02:30:57.1718520Z 2022-11-23T02:30:57.1718644Z Generating XML reports... 
2022-11-23T02:30:57.1719173Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023040.xml 2022-11-23T02:30:57.1719925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1720389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1720945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1721435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1721664Z 2022-11-23T02:30:57.1721772Z Running tests... 2022-11-23T02:30:57.1722177Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1722683Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1723149Z test_multitenancy (__main__.TCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1723486Z ok (1.705s) 2022-11-23T02:30:57.1723633Z 2022-11-23T02:30:57.1723879Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1724264Z Ran 1 test in 1.706s 2022-11-23T02:30:57.1724424Z 2022-11-23T02:30:57.1724521Z OK 2022-11-23T02:30:57.1724653Z 2022-11-23T02:30:57.1724779Z Generating XML reports... 
2022-11-23T02:30:57.1725313Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023044.xml 2022-11-23T02:30:57.1725986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1726434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1726989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1727459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1727689Z 2022-11-23T02:30:57.1727797Z Running tests... 2022-11-23T02:30:57.1728199Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1728710Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1729175Z test_numkeys_delkeys (__main__.TCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1729519Z ok (3.748s) 2022-11-23T02:30:57.1729667Z 2022-11-23T02:30:57.1729914Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1730242Z Ran 1 test in 3.748s 2022-11-23T02:30:57.1730404Z 2022-11-23T02:30:57.1730497Z OK 2022-11-23T02:30:57.1730628Z 2022-11-23T02:30:57.1730750Z Generating XML reports... 
2022-11-23T02:30:57.1731278Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023048.xml 2022-11-23T02:30:57.1731944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:57.1732393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:57.1732945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:57.1733422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:57.1733655Z 2022-11-23T02:30:57.1733764Z Running tests... 2022-11-23T02:30:57.1734166Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1734670Z Test results will be stored in test-reports/python-unittest/distributed.test_store 2022-11-23T02:30:57.1735121Z test_set_get (__main__.TCPStoreTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:57.1735450Z ok (1.711s) 2022-11-23T02:30:57.1735597Z 2022-11-23T02:30:57.1735843Z ---------------------------------------------------------------------- 2022-11-23T02:30:57.1736170Z Ran 1 test in 1.712s 2022-11-23T02:30:57.1736334Z 2022-11-23T02:30:57.1736425Z OK 2022-11-23T02:30:57.1736558Z 2022-11-23T02:30:57.1736664Z Generating XML reports... 
2022-11-23T02:30:57.1737213Z Generated XML report: test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023054.xml 2022-11-23T02:30:57.1737590Z 2022-11-23T02:30:57.1738105Z ##[endgroup] 2022-11-23T02:30:57.1738652Z FINISHED PRINTING LOG FILE of distributed/test_store (/var/lib/jenkins/workspace/test/test-reports/distributed-test_store_fgs2jmwy) 2022-11-23T02:30:57.1738955Z 2022-11-23T02:30:57.5548185Z 2022-11-23T02:30:57.5548609Z real 2m17.371s 2022-11-23T02:30:57.5548897Z user 2m52.718s 2022-11-23T02:30:57.5549165Z sys 2m39.276s 2022-11-23T02:30:57.5549696Z + python test/run_test.py --verbose -i distributed/test_pg_wrapper 2022-11-23T02:30:59.8958577Z Ignoring disabled issues: [] 2022-11-23T02:30:59.9484658Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:30:59.9485230Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:30:59.9485558Z Selected tests: 2022-11-23T02:30:59.9486133Z distributed/test_pg_wrapper 2022-11-23T02:30:59.9517976Z Prioritized test from test file changes. 
2022-11-23T02:30:59.9518370Z reordering tests for PR: 2022-11-23T02:30:59.9518694Z prioritized: [] 2022-11-23T02:30:59.9519182Z the rest: ['distributed/test_pg_wrapper'] 2022-11-23T02:30:59.9519399Z 2022-11-23T02:30:59.9519849Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:30:59.9520803Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:30:59.9523278Z parallel (file granularity) tests: 2022-11-23T02:30:59.9523559Z 2022-11-23T02:30:59.9523862Z serial (file granularity) tests: 2022-11-23T02:30:59.9524164Z distributed/test_pg_wrapper 2022-11-23T02:31:02.2968398Z Ignoring disabled issues: [] 2022-11-23T02:31:02.7209653Z Running distributed/test_pg_wrapper ... [2022-11-23 02:31:02.720373] 2022-11-23T02:31:02.7211128Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_pg_wrapper.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:31:02.720851] 2022-11-23T02:32:56.0519588Z 2022-11-23T02:32:56.0520147Z Expand the folded group to see the log file of distributed/test_pg_wrapper 2022-11-23T02:32:56.0522731Z ##[group]PRINTING LOG FILE of distributed/test_pg_wrapper (/var/lib/jenkins/workspace/test/test-reports/distributed-test_pg_wrapper_97po5h7m) 2022-11-23T02:32:56.0523256Z 2022-11-23T02:32:56.0523580Z 2022-11-23T02:32:56.0525434Z , <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_cuda>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_cuda_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_cuda>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_cuda_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_debug_mode>]> 2022-11-23T02:32:56.0527105Z test_collective_hang (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0527709Z test_collective_shape_mismatch (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0528170Z test_collective_shape_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0528765Z test_collective_shape_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0529284Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0529747Z test_collectives_op_mismatch (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0530636Z test_collectives_op_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0531245Z test_collectives_op_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0531763Z 
test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:32:56.0532970Z , <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collective_shape_mismatch>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collective_shape_mismatch_debug_mode>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collectives_op_mismatch>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collectives_op_mismatch_debug_mode>]> 2022-11-23T02:32:56.0534118Z test_collective_hang (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:32:56.0534649Z test_collective_shape_mismatch (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:32:56.0535309Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:32:56.0535758Z test_collectives_op_mismatch (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:32:56.0536348Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:32:56.0537119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0537762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0538352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0538990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0539229Z 2022-11-23T02:32:56.0539338Z Running tests... 2022-11-23T02:32:56.0539852Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0540477Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0541044Z test_collective_hang (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0541648Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66798 2022-11-23T02:32:56.0542107Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66799 2022-11-23T02:32:56.0542713Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66800 2022-11-23T02:32:56.0543154Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66801 2022-11-23T02:32:56.0544125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0544712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0545585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0546248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0546839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0547447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0548016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0548615Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0549229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0549720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0550412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0551084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0551736Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0552221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0552909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0553399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0553989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0554448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0555081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0555551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0556281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0556785Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0557442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0557938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0558748Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0559445Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0560308Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0561154Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0561656Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:32:56.0562278Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:32:56.0562872Z [E ProcessGroupGloo.cpp:137] Rank 2 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank. 2022-11-23T02:32:56.0563706Z [E ProcessGroupGloo.cpp:137] Rank 3 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank. 2022-11-23T02:32:56.0564136Z ok (4.379s) 2022-11-23T02:32:56.0564286Z 2022-11-23T02:32:56.0564716Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0565067Z Ran 1 test in 4.379s 2022-11-23T02:32:56.0565227Z 2022-11-23T02:32:56.0565300Z OK 2022-11-23T02:32:56.0565435Z 2022-11-23T02:32:56.0565561Z Generating XML reports... 2022-11-23T02:32:56.0566368Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023106.xml 2022-11-23T02:32:56.0567264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0567712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0568462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0568948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0569182Z 2022-11-23T02:32:56.0569303Z Running tests... 
2022-11-23T02:32:56.0569853Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0570404Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0571176Z test_collective_shape_mismatch (__main__.ProcessGroupGlooWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0571680Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67165 2022-11-23T02:32:56.0572306Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67166 2022-11-23T02:32:56.0572747Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 67167 2022-11-23T02:32:56.0573335Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 67168 2022-11-23T02:32:56.0573960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0574567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0575148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0575668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0576427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0576875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0577457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0577911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0578667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0579110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-11-23T02:32:56.0579661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0580126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0580700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0581142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0581691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0582318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0582765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0583221Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0583685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0584463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0584959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0585445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0585933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0586600Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0587136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0587761Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0588438Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0589244Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0589844Z ok (4.339s) 2022-11-23T02:32:56.0589977Z 2022-11-23T02:32:56.0590254Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0590586Z Ran 1 test in 4.340s 2022-11-23T02:32:56.0590748Z 2022-11-23T02:32:56.0590840Z OK 2022-11-23T02:32:56.0590972Z 2022-11-23T02:32:56.0591079Z Generating XML reports... 2022-11-23T02:32:56.0591705Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023113.xml 2022-11-23T02:32:56.0592428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0592875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0593427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0593991Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0594219Z 2022-11-23T02:32:56.0594328Z Running tests... 
2022-11-23T02:32:56.0594717Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0595247Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0595786Z test_collective_shape_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0596300Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67532 2022-11-23T02:32:56.0596730Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67533 2022-11-23T02:32:56.0597161Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 67534 2022-11-23T02:32:56.0597603Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 67535 2022-11-23T02:32:56.0598193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0598638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0599216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0599680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0600235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0600682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0601256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0601700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0602271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0602723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-11-23T02:32:56.0603531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0603977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0604562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0605010Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0605575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0606018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0606457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0607002Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0607461Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0607918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0608403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0608975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0609449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0610108Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0610638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0611354Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0612016Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0612694Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0613082Z ok (6.171s) 2022-11-23T02:32:56.0613229Z 2022-11-23T02:32:56.0613478Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0613806Z Ran 1 test in 6.171s 2022-11-23T02:32:56.0613969Z 2022-11-23T02:32:56.0614060Z OK 2022-11-23T02:32:56.0614190Z 2022-11-23T02:32:56.0614314Z Generating XML reports... 2022-11-23T02:32:56.0614921Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023120.xml 2022-11-23T02:32:56.0615656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0616102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0616655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0617122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0617351Z 2022-11-23T02:32:56.0617458Z Running tests... 
2022-11-23T02:32:56.0617864Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0618381Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0618932Z test_collective_shape_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0619527Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67903 2022-11-23T02:32:56.0619966Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67904 2022-11-23T02:32:56.0620418Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 67905 2022-11-23T02:32:56.0620864Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 67906 2022-11-23T02:32:56.0621482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0621921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0622498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0622970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0623558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0624188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0624888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0625366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0625931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0626384Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0627023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0627486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0628042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0628486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0629134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0629819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0630260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0630930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0631422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0631873Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0632359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0632865Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0633356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0634013Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0634555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0635210Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0635881Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0636576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0637109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:32:56.0637596Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:32:56.0638078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T02:32:56.0638574Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T02:32:56.0639224Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0639908Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0640572Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0641254Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T02:32:56.0641646Z ok (6.272s) 2022-11-23T02:32:56.0641795Z 2022-11-23T02:32:56.0642068Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0642451Z Ran 1 test in 6.273s 2022-11-23T02:32:56.0642622Z 2022-11-23T02:32:56.0642719Z OK 2022-11-23T02:32:56.0642861Z 2022-11-23T02:32:56.0642990Z Generating XML reports... 2022-11-23T02:32:56.0643597Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023128.xml 2022-11-23T02:32:56.0644334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0644789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0645370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0645825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0646061Z 2022-11-23T02:32:56.0646175Z Running tests... 2022-11-23T02:32:56.0646657Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0647181Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0647742Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0648274Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68286 2022-11-23T02:32:56.0648738Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68287 2022-11-23T02:32:56.0649162Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68288 2022-11-23T02:32:56.0649604Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68289 2022-11-23T02:32:56.0650221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0650657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0651248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0651728Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0652310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0652743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0653316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0653788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0654374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0654802Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0655379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0655851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0656408Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0656855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0657439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0657913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0658331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0658818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0659294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0659810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0660315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0660810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0661299Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0661770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0662423Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0663113Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0663793Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0664752Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0665284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:32:56.0665774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:32:56.0666261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T02:32:56.0666890Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0667423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T02:32:56.0668063Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0668729Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0669405Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0669791Z ok (4.362s) 2022-11-23T02:32:56.0669940Z 2022-11-23T02:32:56.0670210Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0670520Z Ran 1 test in 4.362s 2022-11-23T02:32:56.0670682Z 2022-11-23T02:32:56.0670773Z OK 2022-11-23T02:32:56.0670906Z 2022-11-23T02:32:56.0671029Z Generating XML reports... 
2022-11-23T02:32:56.0671638Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023137.xml 2022-11-23T02:32:56.0672363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0672812Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0673383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0673836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0674064Z 2022-11-23T02:32:56.0674168Z Running tests... 2022-11-23T02:32:56.0674573Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0675181Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0675742Z test_collectives_op_mismatch (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0676320Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68665 2022-11-23T02:32:56.0676854Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68666 2022-11-23T02:32:56.0677323Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68667 2022-11-23T02:32:56.0678035Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68668 2022-11-23T02:32:56.0678738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0679209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0679871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0680464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0681065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0681583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0682226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0682844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0683483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0684024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0684684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0685162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0685812Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0686378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0703246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0703803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0704515Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0704999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0705466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0705929Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0706534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0707035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0707531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0708024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0708700Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0709491Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0710163Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0710831Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0711200Z ok (4.370s) 2022-11-23T02:32:56.0711351Z 2022-11-23T02:32:56.0711621Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0711948Z Ran 1 test in 4.370s 2022-11-23T02:32:56.0712109Z 2022-11-23T02:32:56.0712182Z OK 2022-11-23T02:32:56.0712313Z 2022-11-23T02:32:56.0712441Z Generating XML reports... 2022-11-23T02:32:56.0713203Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023144.xml 2022-11-23T02:32:56.0713956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0714395Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0714970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0715438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0715669Z 2022-11-23T02:32:56.0715759Z Running tests... 2022-11-23T02:32:56.0716160Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0716687Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0717219Z test_collectives_op_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0717805Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69032 2022-11-23T02:32:56.0718254Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69033 2022-11-23T02:32:56.0718699Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69034 2022-11-23T02:32:56.0719126Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69035 2022-11-23T02:32:56.0719730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0720179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0720750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0721198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0721777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0722227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0722775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0723235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0723803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0724242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0724792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0725254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0725826Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0726271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0726819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0727278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0727713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0728168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0728626Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0729096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0729579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0730120Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0730621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0731113Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0731751Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0732435Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0733115Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0733795Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0734219Z ok (6.259s) 2022-11-23T02:32:56.0734371Z 2022-11-23T02:32:56.0734645Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0734975Z Ran 1 test in 6.259s 2022-11-23T02:32:56.0735136Z 2022-11-23T02:32:56.0735227Z OK 2022-11-23T02:32:56.0735343Z 2022-11-23T02:32:56.0735468Z Generating XML reports... 2022-11-23T02:32:56.0736090Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023151.xml 2022-11-23T02:32:56.0736822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0737251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0737824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0738287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0738519Z 2022-11-23T02:32:56.0738626Z Running tests... 2022-11-23T02:32:56.0739013Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0739545Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0740094Z test_collectives_op_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0740600Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69403 2022-11-23T02:32:56.0741053Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69404 2022-11-23T02:32:56.0741492Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69405 2022-11-23T02:32:56.0741931Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69406 2022-11-23T02:32:56.0742512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0742961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0743529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0744254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0744832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0745270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0745832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0746275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0746849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0747289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0747934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0748385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0748954Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0749394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0749939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0750558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0750979Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0751434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0751948Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0752388Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0752860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0753337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0753988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0754479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0755135Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0755803Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0756488Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0757320Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0757831Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:32:56.0758286Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:32:56.0758932Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T02:32:56.0759577Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0760099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T02:32:56.0760725Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0761392Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0762208Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0762574Z ok (6.446s) 2022-11-23T02:32:56.0762701Z 2022-11-23T02:32:56.0762959Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0763286Z Ran 1 test in 6.446s 2022-11-23T02:32:56.0763441Z 2022-11-23T02:32:56.0763530Z OK 2022-11-23T02:32:56.0763657Z 2022-11-23T02:32:56.0763759Z Generating XML reports... 
2022-11-23T02:32:56.0764361Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023159.xml 2022-11-23T02:32:56.0765065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0765615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0766166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0766616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0767018Z 2022-11-23T02:32:56.0767127Z Running tests... 2022-11-23T02:32:56.0767510Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0768043Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0768588Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0769108Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69786 2022-11-23T02:32:56.0769540Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69787 2022-11-23T02:32:56.0770048Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69788 2022-11-23T02:32:56.0770647Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69789 2022-11-23T02:32:56.0771216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0771649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0772371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0772837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0773392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0773832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0774409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0774872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0775422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0776017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0776747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0777192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0777769Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0778210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0778771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0779217Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0779656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0780128Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:32:56.0780573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:32:56.0781046Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0781681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0782159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0782615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:32:56.0783310Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0783831Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:32:56.0784847Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0785514Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:32:56.0786191Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:32:56.0786716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:32:56.0787205Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:32:56.0787921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T02:32:56.0788395Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T02:32:56.0789017Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0789651Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0790303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0791135Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:32:56.0791524Z ok (4.473s) 2022-11-23T02:32:56.0791672Z 2022-11-23T02:32:56.0791919Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0792250Z Ran 1 test in 4.473s 2022-11-23T02:32:56.0792411Z 2022-11-23T02:32:56.0792507Z OK 2022-11-23T02:32:56.0792639Z 2022-11-23T02:32:56.0792745Z Generating XML reports... 
2022-11-23T02:32:56.0793368Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023208.xml 2022-11-23T02:32:56.0794093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0794542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0795095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0795891Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0796122Z 2022-11-23T02:32:56.0796228Z Running tests... 2022-11-23T02:32:56.0796629Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0797152Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0797668Z test_collective_hang (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0798162Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70165 2022-11-23T02:32:56.0798753Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70166 2022-11-23T02:32:56.0799330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0799761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0800310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0800742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0801295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0801795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0802335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0802781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0803203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0803671Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0804120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0804580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0805213Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:32:56.0805957Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:32:56.0806439Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:32:56.0806892Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:32:56.0807403Z ok (4.128s) 2022-11-23T02:32:56.0807549Z 2022-11-23T02:32:56.0807800Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0808129Z Ran 1 test in 4.128s 2022-11-23T02:32:56.0808289Z 2022-11-23T02:32:56.0808380Z OK 2022-11-23T02:32:56.0808511Z 2022-11-23T02:32:56.0808633Z Generating XML reports... 2022-11-23T02:32:56.0809323Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023215.xml 2022-11-23T02:32:56.0810063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0810510Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0811065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0811534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0811764Z 2022-11-23T02:32:56.0811872Z Running tests... 2022-11-23T02:32:56.0812426Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0812921Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0813436Z test_collective_shape_mismatch (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0813927Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70374 2022-11-23T02:32:56.0814527Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70375 2022-11-23T02:32:56.0815133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0815578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0816145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0816592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0817166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0817612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0818159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0818621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0819125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0819603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0820071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0820562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0821370Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:32:56.0822028Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:32:56.0822563Z ok (5.864s) 2022-11-23T02:32:56.0822714Z 2022-11-23T02:32:56.0822981Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0823368Z Ran 1 test in 5.864s 2022-11-23T02:32:56.0823529Z 2022-11-23T02:32:56.0823604Z OK 2022-11-23T02:32:56.0823740Z 2022-11-23T02:32:56.0824124Z Generating XML reports... 2022-11-23T02:32:56.0824767Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023222.xml 2022-11-23T02:32:56.0825493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0825921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0826490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0826960Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0827188Z 2022-11-23T02:32:56.0827277Z Running tests... 2022-11-23T02:32:56.0827680Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0828213Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0828761Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0829264Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70599 2022-11-23T02:32:56.0829867Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70600 2022-11-23T02:32:56.0830450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0830877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0831598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0832059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0832629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0833057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0833627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0834245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0834850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0835306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0835785Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0836282Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0836916Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:32:56.0837837Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:32:56.0838363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:32:56.0838837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:32:56.0839437Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:32:56.0840090Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:32:56.0840463Z ok (5.958s) 2022-11-23T02:32:56.0840606Z 2022-11-23T02:32:56.0840863Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0841165Z Ran 1 test in 5.958s 2022-11-23T02:32:56.0841388Z 2022-11-23T02:32:56.0841478Z OK 2022-11-23T02:32:56.0841606Z 2022-11-23T02:32:56.0841730Z Generating XML reports... 2022-11-23T02:32:56.0842314Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023230.xml 2022-11-23T02:32:56.0843012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0843441Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0843994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0844430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0844651Z 2022-11-23T02:32:56.0844755Z Running tests... 
2022-11-23T02:32:56.0845140Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0845635Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0846153Z test_collectives_op_mismatch (__main__.ProcessGroupNCCLWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0846640Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70834 2022-11-23T02:32:56.0847070Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70835 2022-11-23T02:32:56.0847638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0848065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0848616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0849068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0849605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0850038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0850586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0851012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0851435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0851887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0852354Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0852814Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0853443Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:32:56.0854349Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:32:56.0854726Z ok (7.045s) 2022-11-23T02:32:56.0854877Z 2022-11-23T02:32:56.0855143Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0855469Z Ran 1 test in 7.045s 2022-11-23T02:32:56.0855629Z 2022-11-23T02:32:56.0855722Z OK 2022-11-23T02:32:56.0855835Z 2022-11-23T02:32:56.0855958Z Generating XML reports... 2022-11-23T02:32:56.0856580Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023238.xml 2022-11-23T02:32:56.0857306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0857732Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0858302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0858835Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0859062Z 2022-11-23T02:32:56.0859170Z Running tests... 2022-11-23T02:32:56.0859552Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0860084Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:32:56.0860782Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:32:56.0861281Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71060 2022-11-23T02:32:56.0861700Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71061 2022-11-23T02:32:56.0862279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0862708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0863249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0863696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0864695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:32:56.0865135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:32:56.0865681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:32:56.0866135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:32:56.0866570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:32:56.0867022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:32:56.0867505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:32:56.0868010Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:32:56.0868663Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:32:56.0869329Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:32:56.0869856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:32:56.0870507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:32:56.0871128Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:32:56.0871759Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:32:56.0872140Z ok (7.139s) 2022-11-23T02:32:56.0872357Z 2022-11-23T02:32:56.0872624Z ---------------------------------------------------------------------- 2022-11-23T02:32:56.0873111Z Ran 1 test in 7.140s 2022-11-23T02:32:56.0873271Z 2022-11-23T02:32:56.0873362Z OK 2022-11-23T02:32:56.0873495Z 2022-11-23T02:32:56.0873617Z Generating XML reports... 2022-11-23T02:32:56.0874235Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023248.xml 2022-11-23T02:32:56.0874595Z 2022-11-23T02:32:56.0874964Z ##[endgroup] 2022-11-23T02:32:56.0875529Z FINISHED PRINTING LOG FILE of distributed/test_pg_wrapper (/var/lib/jenkins/workspace/test/test-reports/distributed-test_pg_wrapper_97po5h7m) 2022-11-23T02:32:56.0875862Z 2022-11-23T02:32:56.3984969Z 2022-11-23T02:32:56.3985379Z real 1m58.844s 2022-11-23T02:32:56.3985634Z user 4m9.213s 2022-11-23T02:32:56.3985885Z sys 3m12.511s 2022-11-23T02:32:56.3986598Z + python test/run_test.py --verbose -i distributed/rpc/cuda/test_tensorpipe_agent 2022-11-23T02:32:58.7251172Z Ignoring disabled issues: [] 2022-11-23T02:32:58.7780383Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. 
Use packaging.version instead. 2022-11-23T02:32:58.7780960Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:32:58.7781302Z Selected tests: 2022-11-23T02:32:58.7781676Z distributed/rpc/cuda/test_tensorpipe_agent 2022-11-23T02:32:58.7808671Z Prioritized test from test file changes. 2022-11-23T02:32:58.7809087Z reordering tests for PR: 2022-11-23T02:32:58.7809377Z prioritized: [] 2022-11-23T02:32:58.7809865Z the rest: ['distributed/rpc/cuda/test_tensorpipe_agent'] 2022-11-23T02:32:58.7810070Z 2022-11-23T02:32:58.7810622Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:32:58.7811689Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:32:58.7816474Z parallel (file granularity) tests: 2022-11-23T02:32:58.7817048Z 2022-11-23T02:32:58.7817346Z serial (file granularity) tests: 2022-11-23T02:32:58.7817656Z distributed/rpc/cuda/test_tensorpipe_agent 2022-11-23T02:33:01.1300120Z Ignoring disabled issues: [] 2022-11-23T02:33:01.5751200Z Running distributed/rpc/cuda/test_tensorpipe_agent ... [2022-11-23 02:33:01.574450] 2022-11-23T02:33:01.5752351Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/rpc/cuda/test_tensorpipe_agent.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:33:01.574905] 2022-11-23T02:55:39.4273590Z 2022-11-23T02:55:39.4274175Z Expand the folded group to see the log file of distributed/rpc/cuda/test_tensorpipe_agent 2022-11-23T02:55:39.4277200Z ##[group]PRINTING LOG FILE of distributed/rpc/cuda/test_tensorpipe_agent (/var/lib/jenkins/workspace/test/test-reports/distributed-rpc-cuda-test_tensorpipe_agent_f7xxdyjg) 2022-11-23T02:55:39.4281101Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpie681gpq 2022-11-23T02:55:39.4281665Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpie681gpq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4284687Z ]> 2022-11-23T02:55:39.4285617Z test_ddp_dist_autograd_local_vs_remote_gpu (__main__.TensorPipeCudaDdpComparisonTest) 2022-11-23T02:55:39.4287024Z , <__main__.TensorPipeCudaDistAutogradTest testMethod=test_gpu_to_cpu_continuation>, <__main__.TensorPipeCudaDistAutogradTest testMethod=test_gpu_to_cpu_continuation_gpu_root>]> 2022-11-23T02:55:39.4288123Z test_gpu_simple (__main__.TensorPipeCudaDistAutogradTest) 2022-11-23T02:55:39.4288929Z test_gpu_to_cpu_continuation (__main__.TensorPipeCudaDistAutogradTest) 2022-11-23T02:55:39.4289416Z test_gpu_to_cpu_continuation_gpu_root (__main__.TensorPipeCudaDistAutogradTest) 2022-11-23T02:55:39.4290306Z , <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_input_moved_to_cuda_device_script>, <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_invalid_devices>, <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_valid_device>]> 2022-11-23T02:55:39.4291172Z test_input_moved_to_cuda_device (__main__.TensorPipeCudaRemoteModuleTest) 2022-11-23T02:55:39.4291639Z test_input_moved_to_cuda_device_script (__main__.TensorPipeCudaRemoteModuleTest) 2022-11-23T02:55:39.4292095Z test_invalid_devices (__main__.TensorPipeCudaRemoteModuleTest) 2022-11-23T02:55:39.4292497Z test_valid_device (__main__.TensorPipeCudaRemoteModuleTest) 2022-11-23T02:55:39.4293134Z ]> 
2022-11-23T02:55:39.4293593Z test_profiler_remote_cuda (__main__.TensorPipeCudaRpcTest) 2022-11-23T02:55:39.4294870Z , <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_except_last>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_never>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_never_find_unused>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_always>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_except_last>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_never>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_never_find_unused>]> 2022-11-23T02:55:39.4296107Z test_basic_gloo_ckpt_always (__main__.TensorPipePipeWithDDPTest) 2022-11-23T02:55:39.4296539Z test_basic_gloo_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) 2022-11-23T02:55:39.4296946Z test_basic_gloo_ckpt_never (__main__.TensorPipePipeWithDDPTest) 2022-11-23T02:55:39.4297375Z test_basic_gloo_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) 2022-11-23T02:55:39.4297799Z test_basic_nccl_ckpt_always (__main__.TensorPipePipeWithDDPTest) 2022-11-23T02:55:39.4298214Z test_basic_nccl_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) 2022-11-23T02:55:39.4298622Z test_basic_nccl_ckpt_never (__main__.TensorPipePipeWithDDPTest) 2022-11-23T02:55:39.4299048Z test_basic_nccl_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) 2022-11-23T02:55:39.4314889Z , <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_async_execution_with_cuda_future>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_callback_changes_devices>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest 
testMethod=test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_custom_class_with_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_list_with_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_list_with_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_int>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_str>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_not_cuda>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_modify_tensor_inplace>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_replace_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_value_on_bad_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_multi>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_nested>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_nested_multi>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu_to_gpu_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu_to_gpu_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_default_to_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_1>, 
<__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_5>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_6>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_7>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_8>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_5>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_6>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_7>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_8>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_non_default_to_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_to_cpu_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_to_cpu_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_gpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_in_options>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest 
testMethod=test_device_maps_invalid_max_local_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_invalid_max_remote_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_invalid_min_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_many_to_one>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_loop>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_not_timeout>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_remote>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_remote_response>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_response>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_response_loop>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_multi_gpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_multi_gpu_self>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_one_to_many>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_remote>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_return_to_gpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_return_to_gpu_self>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_wrong_worker_name>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_mismatch>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_devices_option_mismatch>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_devices_option_mismatch_reverse>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest 
testMethod=test_owner_rref_forward_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_owner_rref_forward_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_owner_rref_forward_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_owner_rref_forward_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization5>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_with_unpickleable_attributes>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_tensor_view_as_return_value>]> 2022-11-23T02:55:39.4328875Z test_async_execution_nested_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4329435Z test_async_execution_with_cuda_future 
(__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4329973Z test_cuda_future_callback_changes_devices (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4330524Z test_cuda_future_can_extract_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4331033Z test_cuda_future_can_extract_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4331598Z test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4332921Z test_cuda_future_can_extract_custom_class_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4333508Z test_cuda_future_can_extract_list_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4334075Z test_cuda_future_can_extract_list_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4334608Z test_cuda_future_device_as_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4335158Z test_cuda_future_device_as_int (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4335664Z test_cuda_future_device_as_str (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4336163Z test_cuda_future_device_not_cuda (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4336663Z test_cuda_future_modify_tensor_inplace (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4337180Z test_cuda_future_replace_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4337703Z test_cuda_future_value_on_bad_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4338197Z test_custom_stream (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4338650Z test_custom_stream_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4339140Z test_custom_stream_nested 
(__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4339703Z test_custom_stream_nested_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4340163Z test_device_map_cpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4340664Z test_device_map_cpu_to_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4341184Z test_device_map_cpu_to_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4341691Z test_device_map_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4342183Z test_device_map_gpu_default_to_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4342691Z test_device_map_gpu_mixed_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4343168Z test_device_map_gpu_mixed_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4343627Z test_device_map_gpu_mixed_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4344921Z test_device_map_gpu_mixed_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4345755Z test_device_map_gpu_mixed_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4346269Z test_device_map_gpu_mixed_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4346723Z test_device_map_gpu_mixed_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4347186Z test_device_map_gpu_mixed_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4347662Z test_device_map_gpu_mixed_self_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4348133Z test_device_map_gpu_mixed_self_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4348626Z test_device_map_gpu_mixed_self_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4349116Z test_device_map_gpu_mixed_self_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 
2022-11-23T02:55:39.4349597Z test_device_map_gpu_mixed_self_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4350068Z test_device_map_gpu_mixed_self_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4350550Z test_device_map_gpu_mixed_self_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4351034Z test_device_map_gpu_mixed_self_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4351523Z test_device_map_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4352015Z test_device_map_gpu_non_default_to_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4352529Z test_device_map_gpu_to_cpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4353041Z test_device_map_gpu_to_cpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4353517Z test_device_maps_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4353989Z test_device_maps_in_options (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4354584Z test_device_maps_invalid_max_local_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4355122Z test_device_maps_invalid_max_remote_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4355614Z test_device_maps_invalid_min_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4356113Z test_device_maps_many_to_one (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4356598Z test_device_maps_missing_config (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4357077Z test_device_maps_missing_config_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4357599Z test_device_maps_missing_config_not_timeout (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4358166Z test_device_maps_missing_config_remote 
(__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4358807Z test_device_maps_missing_config_remote_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4359610Z test_device_maps_missing_config_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4360180Z test_device_maps_missing_config_response_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4360761Z test_device_maps_multi_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4361375Z test_device_maps_multi_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4361954Z test_device_maps_one_to_many (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4362451Z test_device_maps_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4363107Z test_device_maps_return_to_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4363720Z test_device_maps_return_to_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4364247Z test_device_maps_wrong_worker_name (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4364959Z test_device_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4365522Z test_devices_option_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4366111Z test_devices_option_mismatch_reverse (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4366711Z test_owner_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4367375Z test_owner_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4367984Z test_owner_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4368555Z test_owner_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4393763Z 
test_rref_as_arg_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4394299Z test_rref_as_arg_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4394822Z test_rref_as_arg_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4395342Z test_rref_as_arg_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4395838Z test_rref_as_arg_synchronization5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4396359Z test_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4396879Z test_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4397390Z test_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4397866Z test_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4398381Z test_rref_to_here_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4398876Z test_rref_to_here_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4399354Z test_rref_to_here_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4400005Z test_rref_to_here_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4400578Z test_rref_with_unpickleable_attributes (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4401098Z test_tensor_view_as_return_value (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-11-23T02:55:39.4401988Z , <__main__.TensorPipeTensorPipeCudaDistAutogradTest testMethod=test_dist_autograd_sync_streams>, <__main__.TensorPipeTensorPipeCudaDistAutogradTest testMethod=test_gradients_synchronizations>]> 2022-11-23T02:55:39.4402908Z test_device_maps_backward_pass (__main__.TensorPipeTensorPipeCudaDistAutogradTest) 
2022-11-23T02:55:39.4403441Z test_dist_autograd_sync_streams (__main__.TensorPipeTensorPipeCudaDistAutogradTest) 2022-11-23T02:55:39.4403973Z test_gradients_synchronizations (__main__.TensorPipeTensorPipeCudaDistAutogradTest) 2022-11-23T02:55:39.4404839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4405306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4405895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4406378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4406831Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjfjx0mx5 2022-11-23T02:55:39.4407384Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjfjx0mx5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4407688Z 2022-11-23T02:55:39.4407805Z Running tests... 2022-11-23T02:55:39.4408201Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4408792Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4409397Z test_ddp_dist_autograd_local_vs_remote_gpu (__main__.TensorPipeCudaDdpComparisonTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4409936Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71506 2022-11-23T02:55:39.4410382Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71507 2022-11-23T02:55:39.4410837Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 71508 2022-11-23T02:55:39.4411290Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 71509 2022-11-23T02:55:39.4411906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4412346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4412935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4413430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4413998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4414456Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4415036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4415513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4416076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4416532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4417106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4417640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4418220Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4418693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4419268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4419718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4420192Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6t0rf017 2022-11-23T02:55:39.4420738Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6t0rf017/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4421276Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe_qhqgo4 2022-11-23T02:55:39.4421792Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe_qhqgo4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4422386Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2m39ba5q 2022-11-23T02:55:39.4423041Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2m39ba5q/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4423580Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp38kjf_bk 2022-11-23T02:55:39.4424679Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp38kjf_bk/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4425616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4426281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4426723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.4427200Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.4427622Z fi_getinfo: -61 
2022-11-23T02:55:39.4427905Z fi_getinfo: -61 2022-11-23T02:55:39.4428164Z fi_getinfo: -61 2022-11-23T02:55:39.4428438Z fi_getinfo: -61 2022-11-23T02:55:39.4428823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4429306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4429804Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:55:39.4430460Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:55:39.4431149Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:55:39.4431658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:55:39.4432312Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:55:39.4432988Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:55:39.4433515Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4433981Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4434464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4434941Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4435397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4435874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:55:39.4436443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4436929Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4437259Z ok (7.638s) 2022-11-23T02:55:39.4437409Z 2022-11-23T02:55:39.4437685Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4438018Z Ran 1 test in 7.638s 2022-11-23T02:55:39.4438180Z 2022-11-23T02:55:39.4438254Z OK 2022-11-23T02:55:39.4438388Z 2022-11-23T02:55:39.4438511Z Generating XML reports... 2022-11-23T02:55:39.4439192Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDdpComparisonTest-20221123023306.xml 2022-11-23T02:55:39.4439970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4440400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4441087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4441560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4442007Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps6kmg768 2022-11-23T02:55:39.4442544Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps6kmg768/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4442841Z 2022-11-23T02:55:39.4442951Z Running tests... 2022-11-23T02:55:39.4443361Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4443927Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4444473Z test_gpu_simple (__main__.TensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4444970Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72173 2022-11-23T02:55:39.4445411Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72174 2022-11-23T02:55:39.4445866Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 72175 2022-11-23T02:55:39.4446311Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 72176 2022-11-23T02:55:39.4446921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4447352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4447924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4448395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4448951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4449399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4449970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4450436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4450991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4451438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4452007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4452467Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4453021Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4453463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4454081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4454530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4454997Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw79dlwtt 2022-11-23T02:55:39.4455543Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw79dlwtt/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4456080Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzu7nrrk_ 2022-11-23T02:55:39.4456597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzu7nrrk_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4457127Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4u6ejvj7 2022-11-23T02:55:39.4457660Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4u6ejvj7/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4458222Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpagi20zxc 2022-11-23T02:55:39.4458756Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpagi20zxc/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4459263Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.4459734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.4460182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4460645Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4461035Z fi_getinfo: -61 
2022-11-23T02:55:39.4461291Z fi_getinfo: -61 2022-11-23T02:55:39.4461562Z fi_getinfo: -61 2022-11-23T02:55:39.4461831Z fi_getinfo: -61 2022-11-23T02:55:39.4462046Z ok (6.802s) 2022-11-23T02:55:39.4462193Z 2022-11-23T02:55:39.4462462Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4462845Z Ran 1 test in 6.802s 2022-11-23T02:55:39.4463018Z 2022-11-23T02:55:39.4463113Z OK 2022-11-23T02:55:39.4463227Z 2022-11-23T02:55:39.4463352Z Generating XML reports... 2022-11-23T02:55:39.4464271Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20221123023316.xml 2022-11-23T02:55:39.4465054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4465487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4466063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4466541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4467004Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2xon0vef 2022-11-23T02:55:39.4467575Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2xon0vef/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4467879Z 2022-11-23T02:55:39.4467990Z Running tests... 2022-11-23T02:55:39.4468402Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4468984Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4469534Z test_gpu_to_cpu_continuation (__main__.TensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4470049Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72828 2022-11-23T02:55:39.4470501Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72829 2022-11-23T02:55:39.4470929Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 72830 2022-11-23T02:55:39.4471368Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 72831 2022-11-23T02:55:39.4472055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4472518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4473075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4473543Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4474120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4474546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4475114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4475576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4476151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4476642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4477215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4477678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4478231Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4478670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4479237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4479700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4480144Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3ak1j0kj 2022-11-23T02:55:39.4480698Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3ak1j0kj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4481229Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsuful909 2022-11-23T02:55:39.4481766Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsuful909/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4482275Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps8gdcsio 2022-11-23T02:55:39.4482810Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps8gdcsio/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4483341Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzkszsqwn 2022-11-23T02:55:39.4483860Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzkszsqwn/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4484371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4484850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.4485323Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.4485771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4486156Z fi_getinfo: -61 
2022-11-23T02:55:39.4486434Z fi_getinfo: -61 2022-11-23T02:55:39.4486686Z fi_getinfo: -61 2022-11-23T02:55:39.4486956Z fi_getinfo: -61 2022-11-23T02:55:39.4487191Z ok (6.957s) 2022-11-23T02:55:39.4487340Z 2022-11-23T02:55:39.4487590Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4487923Z Ran 1 test in 6.957s 2022-11-23T02:55:39.4488086Z 2022-11-23T02:55:39.4488179Z OK 2022-11-23T02:55:39.4488313Z 2022-11-23T02:55:39.4488437Z Generating XML reports... 2022-11-23T02:55:39.4489092Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20221123023326.xml 2022-11-23T02:55:39.4489917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4490376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4490934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4491405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4491872Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp61jvul5_ 2022-11-23T02:55:39.4492411Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp61jvul5_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4492710Z 2022-11-23T02:55:39.4492801Z Running tests... 2022-11-23T02:55:39.4493206Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4493897Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4494457Z test_gpu_to_cpu_continuation_gpu_root (__main__.TensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4494985Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73483 2022-11-23T02:55:39.4495438Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73484 2022-11-23T02:55:39.4495885Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 73485 2022-11-23T02:55:39.4496312Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 73486 2022-11-23T02:55:39.4496908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4497360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4497938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4498399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4498981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4499429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4499981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4500448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4501025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4501468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4502018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4502490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4503068Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4503495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4504269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4504736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4505199Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1fqu1682 2022-11-23T02:55:39.4505716Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1fqu1682/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4506249Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcrgffbyk 2022-11-23T02:55:39.4506786Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcrgffbyk/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4507411Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkljgn4a4 2022-11-23T02:55:39.4507936Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkljgn4a4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4508467Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpobmml6uu 2022-11-23T02:55:39.4508999Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpobmml6uu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4509491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4509962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.4510429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4510895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.4511341Z fi_getinfo: -61 
2022-11-23T02:55:39.4511624Z fi_getinfo: -61 2022-11-23T02:55:39.4511899Z fi_getinfo: -61 2022-11-23T02:55:39.4512150Z fi_getinfo: -61 2022-11-23T02:55:39.4512384Z ok (6.916s) 2022-11-23T02:55:39.4512533Z 2022-11-23T02:55:39.4512802Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4513114Z Ran 1 test in 6.916s 2022-11-23T02:55:39.4513277Z 2022-11-23T02:55:39.4513369Z OK 2022-11-23T02:55:39.4513503Z 2022-11-23T02:55:39.4513630Z Generating XML reports... 2022-11-23T02:55:39.4514288Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20221123023336.xml 2022-11-23T02:55:39.4515051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4515506Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4516084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4516533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4516999Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3webid9e 2022-11-23T02:55:39.4517542Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3webid9e/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4517844Z 2022-11-23T02:55:39.4517953Z Running tests... 2022-11-23T02:55:39.4518343Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4518920Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4519485Z test_input_moved_to_cuda_device (__main__.TensorPipeCudaRemoteModuleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4519981Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74138 2022-11-23T02:55:39.4520445Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74139 2022-11-23T02:55:39.4521047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4521498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4522051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4522524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4523101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4523667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4524218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4524687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4525206Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpseylyxbh 2022-11-23T02:55:39.4525733Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpseylyxbh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4526269Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptpfgi2ab 2022-11-23T02:55:39.4526807Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptpfgi2ab/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4527318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4527765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4528134Z 
fi_getinfo: -61 2022-11-23T02:55:39.4528408Z fi_getinfo: -61 2022-11-23T02:55:39.4528621Z ok (6.176s) 2022-11-23T02:55:39.4528769Z 2022-11-23T02:55:39.4529040Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4529425Z Ran 1 test in 6.177s 2022-11-23T02:55:39.4529588Z 2022-11-23T02:55:39.4529661Z OK 2022-11-23T02:55:39.4529795Z 2022-11-23T02:55:39.4529919Z Generating XML reports... 2022-11-23T02:55:39.4530594Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20221123023345.xml 2022-11-23T02:55:39.4531350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4531776Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4532348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4532812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4533255Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzx28rbep 2022-11-23T02:55:39.4533793Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzx28rbep/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4534084Z 2022-11-23T02:55:39.4534181Z Running tests... 2022-11-23T02:55:39.4534561Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4535122Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4535694Z test_input_moved_to_cuda_device_script (__main__.TensorPipeCudaRemoteModuleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4536208Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74424 2022-11-23T02:55:39.4536661Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74425 2022-11-23T02:55:39.4537250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4537700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4538261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4538705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4539267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4539700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4540272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4540720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4541186Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp95a73mn6 2022-11-23T02:55:39.4541724Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp95a73mn6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4542292Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1_7gp_t2 2022-11-23T02:55:39.4542837Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1_7gp_t2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4543342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4543813Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4544387Z 
fi_getinfo: -61 2022-11-23T02:55:39.4544662Z fi_getinfo: -61 2022-11-23T02:55:39.4545165Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1_7gp_t2/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py 2022-11-23T02:55:39.4545891Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp95a73mn6/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py 2022-11-23T02:55:39.4546689Z INFO:torch.distributed.nn.jit.instantiator:Skipped writing /tmp/tmp95a73mn6/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py 2022-11-23T02:55:39.4547166Z ok (6.422s) 2022-11-23T02:55:39.4547318Z 2022-11-23T02:55:39.4547592Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4547903Z Ran 1 test in 6.423s 2022-11-23T02:55:39.4548066Z 2022-11-23T02:55:39.4548160Z OK 2022-11-23T02:55:39.4548294Z 2022-11-23T02:55:39.4548419Z Generating XML reports... 
2022-11-23T02:55:39.4549092Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20221123023354.xml 2022-11-23T02:55:39.4549840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4550291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4550866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4551342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4551794Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn6g9nx6k 2022-11-23T02:55:39.4552331Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn6g9nx6k/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4552628Z 2022-11-23T02:55:39.4552739Z Running tests... 2022-11-23T02:55:39.4553125Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4553706Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4554258Z test_invalid_devices (__main__.TensorPipeCudaRemoteModuleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4554760Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74742 2022-11-23T02:55:39.4555196Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74743 2022-11-23T02:55:39.4555799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4556250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4556802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4557271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4557844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4558292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4558841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4559309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4559834Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf3cqa_vt 2022-11-23T02:55:39.4560363Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf3cqa_vt/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4560895Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9cgbwggx 2022-11-23T02:55:39.4561433Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9cgbwggx/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4561941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4562392Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4562822Z 
fi_getinfo: -61 2022-11-23T02:55:39.4563110Z fi_getinfo: -61 2022-11-23T02:55:39.4563359Z On WorkerInfo(id=1, name=worker1): 2022-11-23T02:55:39.4581735Z RuntimeError('CUDA error: invalid device ordinal\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nException raised from c10_cuda_check_implementation at /var/lib/jenkins/workspace/c10/cuda/CUDAException.cpp:31 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f801cce259b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f801ccdddfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: c10::cuda::c10_cuda_check_implementation(char const*, char const*, int, bool) + 0x42e (0x7f801cf6806e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #3: + 0x17c9d (0x7f801cf40c9d in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #4: + 0xdf3ecd (0x7f801df80ecd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #5: + 0x29c7ab5 (0x7f801fb54ab5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #6: + 0x29c7c5b (0x7f801fb54c5b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #7: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7f8029d57cb3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x20c43b5 (0x7f802a0793b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, 
c10::optional, c10::optional) + 0x168 (0x7f8029d92e58 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x12691af (0x7f802921e1af in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x1321 (0x7f80295b4641 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x22a6e23 (0x7f802a25be23 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x20c8908 (0x7f802a07d908 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: + 0x3443e11 (0x7f802b3f8e11 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: + 0x34443bb (0x7f802b3f93bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7f8029ad1de1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7f80295ac05e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x2471e09 (0x7f802a426e09 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #21: 
at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7f8029c433f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #22: + 0x36467f (0x7f8034cf867f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #23: + 0x364b3c (0x7f8034cf8b3c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #24: + 0x1ddc68 (0x55da20835c68 in /opt/conda/bin/python)\nframe #25: + 0x1049f3 (0x55da2075c9f3 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #27: + 0x104425 (0x55da2075c425 in /opt/conda/bin/python)\nframe #28: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #29: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python)\nframe #30: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python)\nframe #31: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #32: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python)\nframe #33: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python)\nframe #34: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #35: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python)\nframe #36: _PyEval_EvalFrameDefault + 0x26e4 (0x55da2083b774 in /opt/conda/bin/python)\nframe #37: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #38: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python)\nframe #39: + 0xaa8dba (0x7f803543cdba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f803543affd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f803543e2d5 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #42: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7f803543e9a3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7f802c9e0654 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f803543e0c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #45: + 0x4a24a53 (0x7f802c9d9a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f802c9da5e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f802c9d48e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #48: + 0x4a545d2 (0x7f802ca095d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #49: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f801ccd090b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #50: + 0xdbbf4 (0x7f804cb36bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #51: + 0x76db (0x7f806d18a6db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #52: clone + 0x3f (0x7f806ceb361f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-11-23T02:55:39.4592501Z Traceback (most recent call last): 
2022-11-23T02:55:39.4593040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 207, in _run_function 2022-11-23T02:55:39.4593486Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-11-23T02:55:39.4594066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 92, in _create_module 2022-11-23T02:55:39.4594452Z module.to(device) 2022-11-23T02:55:39.4594904Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1120, in to 2022-11-23T02:55:39.4595250Z return self._apply(convert) 2022-11-23T02:55:39.4595722Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 795, in _apply 2022-11-23T02:55:39.4596091Z param_applied = fn(param) 2022-11-23T02:55:39.4596545Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1118, in convert 2022-11-23T02:55:39.4597006Z return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking) 2022-11-23T02:55:39.4597408Z RuntimeError: CUDA error: invalid device ordinal 2022-11-23T02:55:39.4597862Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. 2022-11-23T02:55:39.4598292Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1. 
2022-11-23T02:55:39.4598776Z Exception raised from c10_cuda_check_implementation at /var/lib/jenkins/workspace/c10/cuda/CUDAException.cpp:31 (most recent call first): 2022-11-23T02:55:39.4599635Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f801cce259b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4600757Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f801ccdddfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4601669Z frame #2: c10::cuda::c10_cuda_check_implementation(char const*, char const*, int, bool) + 0x42e (0x7f801cf6806e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-11-23T02:55:39.4602358Z frame #3: + 0x17c9d (0x7f801cf40c9d in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-11-23T02:55:39.4602973Z frame #4: + 0xdf3ecd (0x7f801df80ecd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4603782Z frame #5: + 0x29c7ab5 (0x7f801fb54ab5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4604405Z frame #6: + 0x29c7c5b (0x7f801fb54c5b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4605425Z frame #7: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7f8029d57cb3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4606327Z frame #8: + 0x20c43b5 (0x7f802a0793b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4607268Z frame #9: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7f8029d92e58 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4608201Z frame #10: + 0x12691af (0x7f802921e1af in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4609274Z frame #11: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x1321 (0x7f80295b4641 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4610091Z frame #12: + 0x22a6e23 (0x7f802a25be23 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4611105Z frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4611952Z frame #14: + 0x20c8908 (0x7f802a07d908 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4613113Z frame #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4613922Z frame #16: + 0x3443e11 (0x7f802b3f8e11 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4614517Z frame #17: + 0x34443bb (0x7f802b3f93bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4615423Z frame #18: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7f8029ad1de1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4616488Z frame #19: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, 
c10::optional) + 0x13e (0x7f80295ac05e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4617541Z frame #20: + 0x2471e09 (0x7f802a426e09 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4618530Z frame #21: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7f8029c433f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4619338Z frame #22: + 0x36467f (0x7f8034cf867f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4619990Z frame #23: + 0x364b3c (0x7f8034cf8b3c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4620609Z frame #24: + 0x1ddc68 (0x55da20835c68 in /opt/conda/bin/python) 2022-11-23T02:55:39.4620995Z frame #25: + 0x1049f3 (0x55da2075c9f3 in /opt/conda/bin/python) 2022-11-23T02:55:39.4621412Z frame #26: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4621968Z frame #27: + 0x104425 (0x55da2075c425 in /opt/conda/bin/python) 2022-11-23T02:55:39.4622361Z frame #28: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4622753Z frame #29: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python) 2022-11-23T02:55:39.4623124Z frame #30: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python) 2022-11-23T02:55:39.4623517Z frame #31: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4624213Z frame #32: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python) 2022-11-23T02:55:39.4624601Z frame #33: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python) 2022-11-23T02:55:39.4625000Z frame #34: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4625398Z frame #35: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python) 2022-11-23T02:55:39.4625823Z frame #36: _PyEval_EvalFrameDefault + 
0x26e4 (0x55da2083b774 in /opt/conda/bin/python) 2022-11-23T02:55:39.4626220Z frame #37: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4626613Z frame #38: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python) 2022-11-23T02:55:39.4627226Z frame #39: + 0xaa8dba (0x7f803543cdba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4628155Z frame #40: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f803543affd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4629124Z frame #41: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f803543e2d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4630213Z frame #42: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7f803543e9a3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4631560Z frame #43: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7f802c9e0654 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4632812Z frame #44: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f803543e0c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4633764Z frame #45: + 0x4a24a53 (0x7f802c9d9a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4634832Z frame #46: 
torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f802c9da5e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4636059Z frame #47: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f802c9d48e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4636838Z frame #48: + 0x4a545d2 (0x7f802ca095d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4637512Z frame #49: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f801ccd090b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4638245Z frame #50: + 0xdbbf4 (0x7f804cb36bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-11-23T02:55:39.4638762Z frame #51: + 0x76db (0x7f806d18a6db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-11-23T02:55:39.4639236Z frame #52: clone + 0x3f (0x7f806ceb361f in /lib/x86_64-linux-gnu/libc.so.6) 2022-11-23T02:55:39.4639452Z 2022-11-23T02:55:39.4639470Z 2022-11-23T02:55:39.4639601Z On WorkerInfo(id=1, name=worker1): 2022-11-23T02:55:39.4678408Z RuntimeError('On WorkerInfo(id=1, name=worker1):\nRuntimeError(\'CUDA error: invalid device ordinal\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nException raised from c10_cuda_check_implementation at /var/lib/jenkins/workspace/c10/cuda/CUDAException.cpp:31 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f801cce259b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f801ccdddfe in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: c10::cuda::c10_cuda_check_implementation(char const*, char const*, int, bool) + 0x42e (0x7f801cf6806e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #3: + 0x17c9d (0x7f801cf40c9d in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #4: + 0xdf3ecd (0x7f801df80ecd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #5: + 0x29c7ab5 (0x7f801fb54ab5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #6: + 0x29c7c5b (0x7f801fb54c5b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #7: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7f8029d57cb3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x20c43b5 (0x7f802a0793b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7f8029d92e58 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x12691af (0x7f802921e1af in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x1321 (0x7f80295b4641 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x22a6e23 (0x7f802a25be23 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x20c8908 
(0x7f802a07d908 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: + 0x3443e11 (0x7f802b3f8e11 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: + 0x34443bb (0x7f802b3f93bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7f8029ad1de1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7f80295ac05e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x2471e09 (0x7f802a426e09 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #21: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7f8029c433f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #22: + 0x36467f (0x7f8034cf867f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #23: + 0x364b3c (0x7f8034cf8b3c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #24: + 0x1ddc68 (0x55da20835c68 in /opt/conda/bin/python)\nframe #25: + 0x1049f3 (0x55da2075c9f3 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #27: + 0x104425 (0x55da2075c425 in /opt/conda/bin/python)\nframe #28: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #29: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python)\nframe 
#30: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python)\nframe #31: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #32: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python)\nframe #33: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python)\nframe #34: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #35: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python)\nframe #36: _PyEval_EvalFrameDefault + 0x26e4 (0x55da2083b774 in /opt/conda/bin/python)\nframe #37: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #38: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python)\nframe #39: + 0xaa8dba (0x7f803543cdba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f803543affd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f803543e2d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #42: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7f803543e9a3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7f802c9e0654 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f803543e0c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #45: + 0x4a24a53 (0x7f802c9d9a53 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f802c9da5e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f802c9d48e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #48: + 0x4a545d2 (0x7f802ca095d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #49: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f801ccd090b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #50: + 0xdbbf4 (0x7f804cb36bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #51: + 0x76db (0x7f806d18a6db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #52: clone + 0x3f (0x7f806ceb361f in /lib/x86_64-linux-gnu/libc.so.6)\n\')\nTraceback (most recent call last):\n File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 207, in _run_function\n result = python_udf.func(*python_udf.args, **python_udf.kwargs)\n File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 92, in _create_module\n module.to(device)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1120, in to\n return self._apply(convert)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 795, in _apply\n param_applied = fn(param)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1118, in convert\n return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)\nRuntimeError: CUDA error: invalid device ordinal\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing 
CUDA_LAUNCH_BLOCKING=1.\nException raised from c10_cuda_check_implementation at /var/lib/jenkins/workspace/c10/cuda/CUDAException.cpp:31 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f801cce259b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f801ccdddfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: c10::cuda::c10_cuda_check_implementation(char const*, char const*, int, bool) + 0x42e (0x7f801cf6806e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #3: + 0x17c9d (0x7f801cf40c9d in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #4: + 0xdf3ecd (0x7f801df80ecd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #5: + 0x29c7ab5 (0x7f801fb54ab5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #6: + 0x29c7c5b (0x7f801fb54c5b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #7: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7f8029d57cb3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x20c43b5 (0x7f802a0793b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7f8029d92e58 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x12691af (0x7f802921e1af in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, 
c10::optional) + 0x1321 (0x7f80295b4641 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x22a6e23 (0x7f802a25be23 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x20c8908 (0x7f802a07d908 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: + 0x3443e11 (0x7f802b3f8e11 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: + 0x34443bb (0x7f802b3f93bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7f8029ad1de1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7f80295ac05e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x2471e09 (0x7f802a426e09 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #21: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7f8029c433f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #22: + 0x36467f (0x7f8034cf867f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe 
#23: + 0x364b3c (0x7f8034cf8b3c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #24: + 0x1ddc68 (0x55da20835c68 in /opt/conda/bin/python)\nframe #25: + 0x1049f3 (0x55da2075c9f3 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #27: + 0x104425 (0x55da2075c425 in /opt/conda/bin/python)\nframe #28: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #29: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python)\nframe #30: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python)\nframe #31: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #32: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python)\nframe #33: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python)\nframe #34: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #35: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python)\nframe #36: _PyEval_EvalFrameDefault + 0x26e4 (0x55da2083b774 in /opt/conda/bin/python)\nframe #37: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python)\nframe #38: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python)\nframe #39: + 0xaa8dba (0x7f803543cdba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f803543affd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f803543e2d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #42: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7f803543e9a3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: 
torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7f802c9e0654 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f803543e0c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #45: + 0x4a24a53 (0x7f802c9d9a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f802c9da5e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f802c9d48e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #48: + 0x4a545d2 (0x7f802ca095d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #49: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f801ccd090b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #50: + 0xdbbf4 (0x7f804cb36bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #51: + 0x76db (0x7f806d18a6db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #52: clone + 0x3f (0x7f806ceb361f in /lib/x86_64-linux-gnu/libc.so.6)\n\n') 2022-11-23T02:55:39.4700915Z Traceback (most recent call last): 2022-11-23T02:55:39.4701443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 207, in _run_function 2022-11-23T02:55:39.4701881Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-11-23T02:55:39.4702291Z File "/tmp/tmpn6g9nx6k/_remote_module_non_scriptable.py", line 47, in _remote_forward 
2022-11-23T02:55:39.4702630Z module = module_rref.local_value() 2022-11-23T02:55:39.4703146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 236, in _handle_exception 2022-11-23T02:55:39.4703505Z raise exc 2022-11-23T02:55:39.4704139Z RuntimeError: On WorkerInfo(id=1, name=worker1): 2022-11-23T02:55:39.4704634Z RuntimeError('CUDA error: invalid device ordinal 2022-11-23T02:55:39.4705091Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. 2022-11-23T02:55:39.4705546Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1. 2022-11-23T02:55:39.4706016Z Exception raised from c10_cuda_check_implementation at /var/lib/jenkins/workspace/c10/cuda/CUDAException.cpp:31 (most recent call first): 2022-11-23T02:55:39.4706879Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f801cce259b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4708007Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f801ccdddfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4708866Z frame #2: c10::cuda::c10_cuda_check_implementation(char const*, char const*, int, bool) + 0x42e (0x7f801cf6806e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-11-23T02:55:39.4709515Z frame #3: + 0x17c9d (0x7f801cf40c9d in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-11-23T02:55:39.4710135Z frame #4: + 0xdf3ecd (0x7f801df80ecd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4710760Z frame #5: + 0x29c7ab5 (0x7f801fb54ab5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4711378Z frame #6: + 0x29c7c5b (0x7f801fb54c5b in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4712345Z frame #7: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7f8029d57cb3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4713150Z frame #8: + 0x20c43b5 (0x7f802a0793b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4714062Z frame #9: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7f8029d92e58 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4714826Z frame #10: + 0x12691af (0x7f802921e1af in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4715733Z frame #11: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x1321 (0x7f80295b4641 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4716577Z frame #12: + 0x22a6e23 (0x7f802a25be23 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4717546Z frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4718353Z frame #14: + 0x20c8908 (0x7f802a07d908 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4719320Z frame #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 
2022-11-23T02:55:39.4720184Z frame #16: + 0x3443e11 (0x7f802b3f8e11 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4720796Z frame #17: + 0x34443bb (0x7f802b3f93bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4721692Z frame #18: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7f8029ad1de1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4722959Z frame #19: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7f80295ac05e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4723764Z frame #20: + 0x2471e09 (0x7f802a426e09 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4724732Z frame #21: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7f8029c433f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4725547Z frame #22: + 0x36467f (0x7f8034cf867f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4726176Z frame #23: + 0x364b3c (0x7f8034cf8b3c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4726641Z frame #24: + 0x1ddc68 (0x55da20835c68 in /opt/conda/bin/python) 2022-11-23T02:55:39.4727040Z frame #25: + 0x1049f3 (0x55da2075c9f3 in /opt/conda/bin/python) 2022-11-23T02:55:39.4727438Z frame #26: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4727803Z frame #27: + 0x104425 (0x55da2075c425 in /opt/conda/bin/python) 2022-11-23T02:55:39.4728348Z frame #28: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4728728Z frame #29: + 0x18fc9b 
(0x55da207e7c9b in /opt/conda/bin/python) 2022-11-23T02:55:39.4729105Z frame #30: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python) 2022-11-23T02:55:39.4729460Z frame #31: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4729834Z frame #32: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python) 2022-11-23T02:55:39.4730208Z frame #33: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python) 2022-11-23T02:55:39.4730566Z frame #34: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4730992Z frame #35: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python) 2022-11-23T02:55:39.4731404Z frame #36: _PyEval_EvalFrameDefault + 0x26e4 (0x55da2083b774 in /opt/conda/bin/python) 2022-11-23T02:55:39.4731803Z frame #37: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4732161Z frame #38: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python) 2022-11-23T02:55:39.4732741Z frame #39: + 0xaa8dba (0x7f803543cdba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4733501Z frame #40: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f803543affd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4734475Z frame #41: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f803543e2d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4735777Z frame #42: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7f803543e9a3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4736994Z frame #43: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, 
std::vector >) const + 0x194 (0x7f802c9e0654 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4738257Z frame #44: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f803543e0c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4739282Z frame #45: + 0x4a24a53 (0x7f802c9d9a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4740178Z frame #46: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f802c9da5e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4741384Z frame #47: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f802c9d48e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4742154Z frame #48: + 0x4a545d2 (0x7f802ca095d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4742835Z frame #49: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f801ccd090b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4743347Z frame #50: + 0xdbbf4 (0x7f804cb36bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-11-23T02:55:39.4744461Z frame #51: + 0x76db (0x7f806d18a6db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-11-23T02:55:39.4745311Z frame #52: clone + 0x3f (0x7f806ceb361f in /lib/x86_64-linux-gnu/libc.so.6) 2022-11-23T02:55:39.4745888Z ') 2022-11-23T02:55:39.4746329Z Traceback (most recent call last): 2022-11-23T02:55:39.4747301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 207, in _run_function 2022-11-23T02:55:39.4748102Z result = python_udf.func(*python_udf.args, 
**python_udf.kwargs) 2022-11-23T02:55:39.4748663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 92, in _create_module 2022-11-23T02:55:39.4749037Z module.to(device) 2022-11-23T02:55:39.4749550Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1120, in to 2022-11-23T02:55:39.4749919Z return self._apply(convert) 2022-11-23T02:55:39.4750381Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 795, in _apply 2022-11-23T02:55:39.4750718Z param_applied = fn(param) 2022-11-23T02:55:39.4751177Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1118, in convert 2022-11-23T02:55:39.4751628Z return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking) 2022-11-23T02:55:39.4752013Z RuntimeError: CUDA error: invalid device ordinal 2022-11-23T02:55:39.4752431Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. 2022-11-23T02:55:39.4752866Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1. 
2022-11-23T02:55:39.4753522Z Exception raised from c10_cuda_check_implementation at /var/lib/jenkins/workspace/c10/cuda/CUDAException.cpp:31 (most recent call first): 2022-11-23T02:55:39.4754454Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f801cce259b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4755435Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f801ccdddfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4756635Z frame #2: c10::cuda::c10_cuda_check_implementation(char const*, char const*, int, bool) + 0x42e (0x7f801cf6806e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-11-23T02:55:39.4757322Z frame #3: + 0x17c9d (0x7f801cf40c9d in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-11-23T02:55:39.4757961Z frame #4: + 0xdf3ecd (0x7f801df80ecd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4758595Z frame #5: + 0x29c7ab5 (0x7f801fb54ab5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4759235Z frame #6: + 0x29c7c5b (0x7f801fb54c5b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.4760252Z frame #7: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7f8029d57cb3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4761228Z frame #8: + 0x20c43b5 (0x7f802a0793b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4762143Z frame #9: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7f8029d92e58 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4763154Z frame #10: + 0x12691af (0x7f802921e1af in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4764101Z frame #11: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x1321 (0x7f80295b4641 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4764898Z frame #12: + 0x22a6e23 (0x7f802a25be23 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4765952Z frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4766935Z frame #14: + 0x20c8908 (0x7f802a07d908 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4767940Z frame #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7f8029a780e3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4768745Z frame #16: + 0x3443e11 (0x7f802b3f8e11 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4769545Z frame #17: + 0x34443bb (0x7f802b3f93bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4770555Z frame #18: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7f8029ad1de1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4771653Z frame #19: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, 
c10::optional) + 0x13e (0x7f80295ac05e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4772587Z frame #20: + 0x2471e09 (0x7f802a426e09 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4773502Z frame #21: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7f8029c433f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4774297Z frame #22: + 0x36467f (0x7f8034cf867f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4774922Z frame #23: + 0x364b3c (0x7f8034cf8b3c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4775366Z frame #24: + 0x1ddc68 (0x55da20835c68 in /opt/conda/bin/python) 2022-11-23T02:55:39.4775735Z frame #25: + 0x1049f3 (0x55da2075c9f3 in /opt/conda/bin/python) 2022-11-23T02:55:39.4776112Z frame #26: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4776485Z frame #27: + 0x104425 (0x55da2075c425 in /opt/conda/bin/python) 2022-11-23T02:55:39.4776842Z frame #28: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4777225Z frame #29: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python) 2022-11-23T02:55:39.4777608Z frame #30: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python) 2022-11-23T02:55:39.4777986Z frame #31: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4778341Z frame #32: + 0x18fc9b (0x55da207e7c9b in /opt/conda/bin/python) 2022-11-23T02:55:39.4778713Z frame #33: + 0x1052a5 (0x55da2075d2a5 in /opt/conda/bin/python) 2022-11-23T02:55:39.4779088Z frame #34: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4779443Z frame #35: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python) 2022-11-23T02:55:39.4779842Z frame #36: _PyEval_EvalFrameDefault + 
0x26e4 (0x55da2083b774 in /opt/conda/bin/python) 2022-11-23T02:55:39.4780241Z frame #37: + 0x18f742 (0x55da207e7742 in /opt/conda/bin/python) 2022-11-23T02:55:39.4780612Z frame #38: _PyObject_Call + 0x20a (0x55da2079ffaa in /opt/conda/bin/python) 2022-11-23T02:55:39.4781408Z frame #39: + 0xaa8dba (0x7f803543cdba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4782207Z frame #40: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f803543affd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4783205Z frame #41: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f803543e2d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4785089Z frame #42: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7f803543e9a3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4786392Z frame #43: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7f802c9e0654 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4787653Z frame #44: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f803543e0c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.4788841Z frame #45: + 0x4a24a53 (0x7f802c9d9a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4789771Z frame #46: 
torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f802c9da5e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4790828Z frame #47: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f802c9d48e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4791732Z frame #48: + 0x4a545d2 (0x7f802ca095d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.4792386Z frame #49: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f801ccd090b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.4792858Z frame #50: + 0xdbbf4 (0x7f804cb36bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-11-23T02:55:39.4793386Z frame #51: + 0x76db (0x7f806d18a6db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-11-23T02:55:39.4793871Z frame #52: clone + 0x3f (0x7f806ceb361f in /lib/x86_64-linux-gnu/libc.so.6) 2022-11-23T02:55:39.4794089Z 2022-11-23T02:55:39.4794107Z 2022-11-23T02:55:39.4794125Z 2022-11-23T02:55:39.4794225Z ok (4.824s) 2022-11-23T02:55:39.4794349Z 2022-11-23T02:55:39.4794616Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4794939Z Ran 1 test in 4.824s 2022-11-23T02:55:39.4795273Z 2022-11-23T02:55:39.4795368Z OK 2022-11-23T02:55:39.4795501Z 2022-11-23T02:55:39.4795607Z Generating XML reports... 
2022-11-23T02:55:39.4796285Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20221123023404.xml 2022-11-23T02:55:39.4797050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4797503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4798219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4798750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4799214Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwk2jzo64 2022-11-23T02:55:39.4799714Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwk2jzo64/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4800005Z 2022-11-23T02:55:39.4800111Z Running tests... 2022-11-23T02:55:39.4800509Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4801067Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4801573Z test_valid_device (__main__.TensorPipeCudaRemoteModuleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4802047Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75027 2022-11-23T02:55:39.4802480Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75028 2022-11-23T02:55:39.4803130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4803547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4804273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4804743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4805302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4805743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4806314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4806772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4807382Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppii4k6cb 2022-11-23T02:55:39.4807908Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppii4k6cb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4808424Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoaxcve5g 2022-11-23T02:55:39.4808921Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoaxcve5g/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4809418Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4809871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4810251Z 
fi_getinfo: -61 2022-11-23T02:55:39.4810504Z fi_getinfo: -61 2022-11-23T02:55:39.4810737Z ok (6.181s) 2022-11-23T02:55:39.4810885Z 2022-11-23T02:55:39.4811150Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4811459Z Ran 1 test in 6.182s 2022-11-23T02:55:39.4811617Z 2022-11-23T02:55:39.4811710Z OK 2022-11-23T02:55:39.4811842Z 2022-11-23T02:55:39.4811961Z Generating XML reports... 2022-11-23T02:55:39.4812590Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20221123023411.xml 2022-11-23T02:55:39.4813336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4813772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4814325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4814761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4815209Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc56tpxmq 2022-11-23T02:55:39.4815732Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc56tpxmq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4816072Z 2022-11-23T02:55:39.4816184Z Running tests... 2022-11-23T02:55:39.4816565Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4817123Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4817647Z test_profiler_remote_cuda (__main__.TensorPipeCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4818102Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75313 2022-11-23T02:55:39.4818537Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75314 2022-11-23T02:55:39.4818964Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75315 2022-11-23T02:55:39.4819390Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75316 2022-11-23T02:55:39.4819963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4820452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4821015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4821452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4822018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4822639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4823310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4823779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4824562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4825012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4825576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4826021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4826596Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4827033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4827570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4828031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4828667Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgellkinp 2022-11-23T02:55:39.4829197Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgellkinp/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4829697Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3tp6bvtc 2022-11-23T02:55:39.4830215Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3tp6bvtc/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4830732Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6q2p8t3s 2022-11-23T02:55:39.4831228Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6q2p8t3s/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4831740Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphbq7gwk6 2022-11-23T02:55:39.4832254Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphbq7gwk6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4832742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4833180Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4833726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.4834202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.4834586Z fi_getinfo: -61 
2022-11-23T02:55:39.4834837Z fi_getinfo: -61 2022-11-23T02:55:39.4835095Z fi_getinfo: -61 2022-11-23T02:55:39.4835355Z fi_getinfo: -61 2022-11-23T02:55:39.4835563Z ok (9.247s) 2022-11-23T02:55:39.4835705Z 2022-11-23T02:55:39.4836139Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4836471Z Ran 1 test in 9.248s 2022-11-23T02:55:39.4836613Z 2022-11-23T02:55:39.4836706Z OK 2022-11-23T02:55:39.4836838Z 2022-11-23T02:55:39.4836965Z Generating XML reports... 2022-11-23T02:55:39.4837600Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRpcTest-20221123023420.xml 2022-11-23T02:55:39.4838396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4839010Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4839566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4840017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4840447Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmh6n0f_d 2022-11-23T02:55:39.4840968Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmh6n0f_d/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4841257Z 2022-11-23T02:55:39.4841542Z Running tests... 2022-11-23T02:55:39.4841950Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4842510Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4843062Z test_basic_gloo_ckpt_always (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4843566Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75820 2022-11-23T02:55:39.4843999Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75821 2022-11-23T02:55:39.4844751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4845187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4845928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4846383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4846959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4847404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4847953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4848420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4848884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_nq959pa 2022-11-23T02:55:39.4849419Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_nq959pa/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4849932Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgdb8t976 2022-11-23T02:55:39.4850615Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgdb8t976/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4851104Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4851737Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4852118Z 
fi_getinfo: -61 2022-11-23T02:55:39.4852447Z fi_getinfo: -61 2022-11-23T02:55:39.4852835Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4853315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4853970Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4854662Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4855306Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4855897Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4856776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4857272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4857612Z ok (8.630s) 2022-11-23T02:55:39.4857764Z 2022-11-23T02:55:39.4858040Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4858379Z Ran 1 test in 8.630s 2022-11-23T02:55:39.4858542Z 2022-11-23T02:55:39.4858636Z OK 2022-11-23T02:55:39.4858749Z 2022-11-23T02:55:39.4858875Z Generating XML reports... 
2022-11-23T02:55:39.4859530Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023432.xml 2022-11-23T02:55:39.4860284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4860715Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4861292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4861762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4862231Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6urxssd4 2022-11-23T02:55:39.4862754Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6urxssd4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4863116Z 2022-11-23T02:55:39.4863233Z Running tests... 2022-11-23T02:55:39.4863648Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4864405Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4864965Z test_basic_gloo_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4865469Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76127 2022-11-23T02:55:39.4865925Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76128 2022-11-23T02:55:39.4866518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4866967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4867729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4868190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4868728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4869339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4869910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4870363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4870907Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzf5vvli9 2022-11-23T02:55:39.4871454Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzf5vvli9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4871988Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfslsvz1k 2022-11-23T02:55:39.4872654Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfslsvz1k/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4873143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4873596Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4873955Z 
fi_getinfo: -61 2022-11-23T02:55:39.4874225Z fi_getinfo: -61 2022-11-23T02:55:39.4874594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4875143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4875759Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4876418Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4877034Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4877601Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4878084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4878553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4878893Z ok (8.528s) 2022-11-23T02:55:39.4879018Z 2022-11-23T02:55:39.4879281Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4879604Z Ran 1 test in 8.528s 2022-11-23T02:55:39.4879764Z 2022-11-23T02:55:39.4879860Z OK 2022-11-23T02:55:39.4879996Z 2022-11-23T02:55:39.4880098Z Generating XML reports... 
2022-11-23T02:55:39.4880739Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023444.xml 2022-11-23T02:55:39.4881636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4882090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4882646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4883116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4883583Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpctfmz077 2022-11-23T02:55:39.4884290Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpctfmz077/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4884562Z 2022-11-23T02:55:39.4884672Z Running tests... 2022-11-23T02:55:39.4885063Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4885799Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4886325Z test_basic_gloo_ckpt_never (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4886819Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76434 2022-11-23T02:55:39.4887270Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76435 2022-11-23T02:55:39.4887873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4888632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4889256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4889731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4890286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4890737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4891311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4891777Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4892221Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpteezr1os 2022-11-23T02:55:39.4892760Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpteezr1os/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4893349Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2zb5fapi 2022-11-23T02:55:39.4893864Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2zb5fapi/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4894371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4894841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4895232Z 
fi_getinfo: -61 2022-11-23T02:55:39.4895492Z fi_getinfo: -61 2022-11-23T02:55:39.4895879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4896384Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4897020Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4897818Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4898474Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4899090Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4899574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4900067Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4900588Z ok (8.412s) 2022-11-23T02:55:39.4900739Z 2022-11-23T02:55:39.4901007Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4901311Z Ran 1 test in 8.412s 2022-11-23T02:55:39.4901474Z 2022-11-23T02:55:39.4901569Z OK 2022-11-23T02:55:39.4901703Z 2022-11-23T02:55:39.4901827Z Generating XML reports... 
2022-11-23T02:55:39.4902455Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023455.xml 2022-11-23T02:55:39.4903185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4903624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4904563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4905016Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4905485Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpekwh494a 2022-11-23T02:55:39.4906022Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpekwh494a/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4906318Z 2022-11-23T02:55:39.4906408Z Running tests... 2022-11-23T02:55:39.4906818Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4907466Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4908195Z test_basic_gloo_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4908666Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76741 2022-11-23T02:55:39.4909277Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76742 2022-11-23T02:55:39.4909884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4910315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4910885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4911349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4911993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4912417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4912984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4913597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4914044Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphemqiam6 2022-11-23T02:55:39.4914546Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphemqiam6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4915062Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeygd465e 2022-11-23T02:55:39.4915577Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeygd465e/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4916056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4916509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4917044Z 
fi_getinfo: -61 2022-11-23T02:55:39.4917320Z fi_getinfo: -61 2022-11-23T02:55:39.4917677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4918171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4918825Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4919486Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4920128Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4920892Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4921234Z ok (8.435s) 2022-11-23T02:55:39.4921361Z 2022-11-23T02:55:39.4921619Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4921938Z Ran 1 test in 8.435s 2022-11-23T02:55:39.4922093Z 2022-11-23T02:55:39.4922354Z OK 2022-11-23T02:55:39.4922486Z 2022-11-23T02:55:39.4922592Z Generating XML reports... 
2022-11-23T02:55:39.4923245Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023506.xml 2022-11-23T02:55:39.4923989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4924433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4924986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4925502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4925974Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxaqbldaq 2022-11-23T02:55:39.4926496Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxaqbldaq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4926798Z 2022-11-23T02:55:39.4926908Z Running tests... 2022-11-23T02:55:39.4927314Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4927888Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4928412Z test_basic_nccl_ckpt_always (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4929063Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77048 2022-11-23T02:55:39.4929498Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77049 2022-11-23T02:55:39.4930145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4930562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4931117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4931568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4932295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4932733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4933297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4933762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4934216Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphtzy0g71 2022-11-23T02:55:39.4934910Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphtzy0g71/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4935422Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpabjno8s7 2022-11-23T02:55:39.4935922Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpabjno8s7/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4936598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4937070Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4937458Z 
fi_getinfo: -61 2022-11-23T02:55:39.4937718Z fi_getinfo: -61 2022-11-23T02:55:39.4938092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4938586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4939385Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4940050Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4940672Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4941250Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4941707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4942362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4942715Z ok (9.928s) 2022-11-23T02:55:39.4942862Z 2022-11-23T02:55:39.4943134Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4943446Z Ran 1 test in 9.929s 2022-11-23T02:55:39.4943606Z 2022-11-23T02:55:39.4943745Z OK 2022-11-23T02:55:39.4944069Z 2022-11-23T02:55:39.4944202Z Generating XML reports... 
2022-11-23T02:55:39.4944998Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023518.xml 2022-11-23T02:55:39.4945895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4946343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4946917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4947369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4947833Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy41vno2d 2022-11-23T02:55:39.4948490Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy41vno2d/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4948943Z 2022-11-23T02:55:39.4949030Z Running tests... 2022-11-23T02:55:39.4949420Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4949976Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4950513Z test_basic_nccl_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4950982Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77366 2022-11-23T02:55:39.4951417Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77367 2022-11-23T02:55:39.4951999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4952415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4952975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4953430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4954168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4954595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4955164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4955626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4956090Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7jonq4_ 2022-11-23T02:55:39.4956611Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv7jonq4_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4957478Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpet51judr 2022-11-23T02:55:39.4958017Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpet51judr/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4958507Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4958972Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4959356Z 
fi_getinfo: -61 2022-11-23T02:55:39.4959629Z fi_getinfo: -61 2022-11-23T02:55:39.4959988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4960481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4961289Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4961934Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4962639Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4963469Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4963962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4964430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4964780Z ok (9.858s) 2022-11-23T02:55:39.4964928Z 2022-11-23T02:55:39.4965197Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4965510Z Ran 1 test in 9.858s 2022-11-23T02:55:39.4965671Z 2022-11-23T02:55:39.4965762Z OK 2022-11-23T02:55:39.4965895Z 2022-11-23T02:55:39.4966017Z Generating XML reports... 
2022-11-23T02:55:39.4966675Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023530.xml 2022-11-23T02:55:39.4967654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4968087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4968640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4969076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4969526Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxg8zbxxz 2022-11-23T02:55:39.4970228Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxg8zbxxz/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4970527Z 2022-11-23T02:55:39.4970639Z Running tests... 2022-11-23T02:55:39.4971029Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4971612Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4972160Z test_basic_nccl_ckpt_never (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4972658Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77684 2022-11-23T02:55:39.4973235Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77685 2022-11-23T02:55:39.4973812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4974247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4974782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4975233Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4975787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4976221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4976748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4977195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4977640Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp83n3w510 2022-11-23T02:55:39.4978139Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp83n3w510/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4978649Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxqb9unyb 2022-11-23T02:55:39.4979170Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxqb9unyb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4979659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4980151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4980533Z 
fi_getinfo: -61 2022-11-23T02:55:39.4980800Z fi_getinfo: -61 2022-11-23T02:55:39.4981145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4981799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4982458Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4983142Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4983763Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4984574Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4985154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4985645Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:55:39.4985730Z ok (9.925s) 2022-11-23T02:55:39.4985751Z 2022-11-23T02:55:39.4986022Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4986134Z Ran 1 test in 9.925s 2022-11-23T02:55:39.4986153Z 2022-11-23T02:55:39.4986245Z OK 2022-11-23T02:55:39.4986264Z 2022-11-23T02:55:39.4986389Z Generating XML reports... 
2022-11-23T02:55:39.4986885Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023543.xml 2022-11-23T02:55:39.4987257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4987437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4987798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4987995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4988254Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9988qale 2022-11-23T02:55:39.4988678Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9988qale/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4988697Z 2022-11-23T02:55:39.4988969Z Running tests... 2022-11-23T02:55:39.4989235Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4989590Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.4989875Z test_basic_nccl_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.4990100Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78002 2022-11-23T02:55:39.4990304Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78003 2022-11-23T02:55:39.4990674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4990851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4991229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4991421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4991940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4992108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4992467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4992693Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4992953Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw9gumzds 2022-11-23T02:55:39.4993215Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw9gumzds/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4993461Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5agow9s_ 2022-11-23T02:55:39.4993719Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5agow9s_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4993938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.4994157Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.4994300Z 
fi_getinfo: -61 2022-11-23T02:55:39.4994430Z fi_getinfo: -61 2022-11-23T02:55:39.4994696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:55:39.4994934Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:55:39.4995326Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4995890Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:55:39.4996242Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4996585Z [W logger.cpp:318] Warning: Cuda time stats are not collected for multi-device modules. (function operator()) 2022-11-23T02:55:39.4996687Z ok (9.943s) 2022-11-23T02:55:39.4996707Z 2022-11-23T02:55:39.4996973Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.4997072Z Ran 1 test in 9.943s 2022-11-23T02:55:39.4997109Z 2022-11-23T02:55:39.4997183Z OK 2022-11-23T02:55:39.4997205Z 2022-11-23T02:55:39.4997329Z Generating XML reports... 
2022-11-23T02:55:39.4997826Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023556.xml 2022-11-23T02:55:39.4998199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.4998534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.4998907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.4999092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.4999341Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzt0zbl9u 2022-11-23T02:55:39.4999587Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzt0zbl9u/_remote_module_non_scriptable.py 2022-11-23T02:55:39.4999610Z 2022-11-23T02:55:39.4999721Z Running tests... 2022-11-23T02:55:39.4999979Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5000326Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5000642Z test_async_execution_nested_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5000853Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78320 2022-11-23T02:55:39.5001064Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78321 2022-11-23T02:55:39.5001272Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78322 2022-11-23T02:55:39.5001461Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78323 2022-11-23T02:55:39.5001821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5002040Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5002416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5002603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5002952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5003121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5003480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5003662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5003991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5004210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5004577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5004761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5005297Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5005470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5005839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5006030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5006292Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps0uijn91 2022-11-23T02:55:39.5006549Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps0uijn91/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5006801Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr5fhoinw 2022-11-23T02:55:39.5007073Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr5fhoinw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5007322Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9nohsjrt 2022-11-23T02:55:39.5007588Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9nohsjrt/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5007838Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1v_g3qss 2022-11-23T02:55:39.5008251Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1v_g3qss/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5008473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5008679Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5008898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5009115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5009261Z fi_getinfo: -61 
2022-11-23T02:55:39.5009393Z fi_getinfo: -61 2022-11-23T02:55:39.5009524Z fi_getinfo: -61 2022-11-23T02:55:39.5009655Z fi_getinfo: -61 2022-11-23T02:55:39.5009734Z ok (12.058s) 2022-11-23T02:55:39.5009770Z 2022-11-23T02:55:39.5010010Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5010120Z Ran 1 test in 12.058s 2022-11-23T02:55:39.5010139Z 2022-11-23T02:55:39.5010227Z OK 2022-11-23T02:55:39.5010246Z 2022-11-23T02:55:39.5010369Z Generating XML reports... 2022-11-23T02:55:39.5010899Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023609.xml 2022-11-23T02:55:39.5011305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5011485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5011854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5012020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5012265Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwq812abt 2022-11-23T02:55:39.5012526Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwq812abt/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5012545Z 2022-11-23T02:55:39.5012648Z Running tests... 2022-11-23T02:55:39.5012906Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5013250Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5013606Z test_async_execution_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5013820Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78827 2022-11-23T02:55:39.5014030Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78828 2022-11-23T02:55:39.5014220Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78829 2022-11-23T02:55:39.5014428Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78830 2022-11-23T02:55:39.5014789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5014960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5015321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5015509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5015867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5016036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5016376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5016559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5016909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5017078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5017436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5017621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5017973Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5018140Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5018503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5018666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5018911Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6fk2cf0v 2022-11-23T02:55:39.5019172Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6fk2cf0v/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5019416Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpenck2ikf 2022-11-23T02:55:39.5019677Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpenck2ikf/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5019970Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpls92_b87 2022-11-23T02:55:39.5020230Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpls92_b87/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5020472Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwu5w67vx 2022-11-23T02:55:39.5020708Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwu5w67vx/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5020929Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5021143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5021365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5021583Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5021773Z fi_getinfo: -61 
2022-11-23T02:55:39.5021905Z fi_getinfo: -61 2022-11-23T02:55:39.5022038Z fi_getinfo: -61 2022-11-23T02:55:39.5022154Z fi_getinfo: -61 2022-11-23T02:55:39.5022251Z ok (11.938s) 2022-11-23T02:55:39.5022270Z 2022-11-23T02:55:39.5022528Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5022638Z Ran 1 test in 11.938s 2022-11-23T02:55:39.5022656Z 2022-11-23T02:55:39.5022745Z OK 2022-11-23T02:55:39.5022763Z 2022-11-23T02:55:39.5022881Z Generating XML reports... 2022-11-23T02:55:39.5023600Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023623.xml 2022-11-23T02:55:39.5024157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5024320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5024701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5024898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5025153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplfzmod24 2022-11-23T02:55:39.5025420Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplfzmod24/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5025440Z 2022-11-23T02:55:39.5025546Z Running tests... 2022-11-23T02:55:39.5025811Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5026167Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5026493Z test_cuda_future_callback_changes_devices (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5026694Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79334 2022-11-23T02:55:39.5026916Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79335 2022-11-23T02:55:39.5027132Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79336 2022-11-23T02:55:39.5027346Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79337 2022-11-23T02:55:39.5027716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5027891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5028271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5028456Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5028805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5028978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5029577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5029771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5030124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5030292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5030646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5030828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5031179Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5031328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5031761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5031943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5032190Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprhgk_1ik 2022-11-23T02:55:39.5032451Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprhgk_1ik/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5032694Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnztypkq0 2022-11-23T02:55:39.5032953Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnztypkq0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5033197Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6ldhjjxe 2022-11-23T02:55:39.5033436Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6ldhjjxe/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5033683Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptjn4tz_1 2022-11-23T02:55:39.5033943Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptjn4tz_1/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5034166Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5034381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5034600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5034818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5034916Z ok (11.557s) 
2022-11-23T02:55:39.5034935Z 2022-11-23T02:55:39.5035196Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5035290Z Ran 1 test in 11.557s 2022-11-23T02:55:39.5035308Z 2022-11-23T02:55:39.5035397Z OK 2022-11-23T02:55:39.5035415Z 2022-11-23T02:55:39.5035537Z Generating XML reports... 2022-11-23T02:55:39.5036061Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023638.xml 2022-11-23T02:55:39.5036429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5036760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5037138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5037326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5037580Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2h8rs_k8 2022-11-23T02:55:39.5037842Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2h8rs_k8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5037861Z 2022-11-23T02:55:39.5037970Z Running tests... 2022-11-23T02:55:39.5038293Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5038660Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5038974Z test_cuda_future_can_extract_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5039190Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79673 2022-11-23T02:55:39.5039404Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79674 2022-11-23T02:55:39.5039772Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79675 2022-11-23T02:55:39.5039978Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79676 2022-11-23T02:55:39.5040334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5040550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5040919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5041100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5041435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5041600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5041956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5042307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5042666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5042840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5043219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5043405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5043746Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5043920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5044293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5044479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5044732Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeci3rfml 2022-11-23T02:55:39.5045163Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeci3rfml/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5045415Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9zcbevst 2022-11-23T02:55:39.5045669Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9zcbevst/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5045911Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpygnd7yql 2022-11-23T02:55:39.5046333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpygnd7yql/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5046586Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpru6xlmad 2022-11-23T02:55:39.5046850Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpru6xlmad/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5047078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5047299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5047521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5047793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5047900Z ok (10.010s) 
2022-11-23T02:55:39.5047920Z 2022-11-23T02:55:39.5048188Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5048283Z Ran 1 test in 10.011s 2022-11-23T02:55:39.5048302Z 2022-11-23T02:55:39.5048391Z OK 2022-11-23T02:55:39.5048410Z 2022-11-23T02:55:39.5048532Z Generating XML reports... 2022-11-23T02:55:39.5049238Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023653.xml 2022-11-23T02:55:39.5049595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5049763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5050179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5050361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5050591Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgj5pexf2 2022-11-23T02:55:39.5050849Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgj5pexf2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5050868Z 2022-11-23T02:55:39.5050972Z Running tests... 2022-11-23T02:55:39.5051225Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5051571Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5051876Z test_cuda_future_can_extract_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5052086Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80132 2022-11-23T02:55:39.5052300Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80133 2022-11-23T02:55:39.5052503Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80134 2022-11-23T02:55:39.5052692Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80135 2022-11-23T02:55:39.5053046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5053214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5053577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5053762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5054113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5054462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5054837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5055009Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5055371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5055544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5055912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5056100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5056456Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5056626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5057056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5057406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5057810Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6thde87b 2022-11-23T02:55:39.5058081Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6thde87b/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5058336Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmnbbyv6c 2022-11-23T02:55:39.5058605Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmnbbyv6c/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5058856Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp61vwf__t 2022-11-23T02:55:39.5059118Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp61vwf__t/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5059417Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqkgdrfw5 2022-11-23T02:55:39.5059683Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqkgdrfw5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5059895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5060117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5060343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5060566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5060665Z ok (10.220s) 
2022-11-23T02:55:39.5060685Z 2022-11-23T02:55:39.5060955Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5061068Z Ran 1 test in 10.220s 2022-11-23T02:55:39.5061087Z 2022-11-23T02:55:39.5061177Z OK 2022-11-23T02:55:39.5061200Z 2022-11-23T02:55:39.5061321Z Generating XML reports... 2022-11-23T02:55:39.5062015Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023706.xml 2022-11-23T02:55:39.5062374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5062543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5062903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5063085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5063542Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9x1__6w_ 2022-11-23T02:55:39.5063806Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9x1__6w_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5063829Z 2022-11-23T02:55:39.5064104Z Running tests... 2022-11-23T02:55:39.5064364Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5064721Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5065069Z test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5065282Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80467 2022-11-23T02:55:39.5065497Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80468 2022-11-23T02:55:39.5065709Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80469 2022-11-23T02:55:39.5065921Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80470 2022-11-23T02:55:39.5066289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5066530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5066909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5067097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5067658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5067828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5068183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5068361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5068705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5068951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5069313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5069473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5069996Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5070169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5070542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5070727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5070979Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2mxxwyjc 2022-11-23T02:55:39.5071234Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphkhrn_1j 2022-11-23T02:55:39.5071511Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2mxxwyjc/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5071754Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphkhrn_1j/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5072007Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcnqefnv5 2022-11-23T02:55:39.5072271Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcnqefnv5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5072522Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplt7p9_rv 2022-11-23T02:55:39.5072785Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplt7p9_rv/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5073013Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5073233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5073463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5073686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5073770Z ok (10.104s) 
2022-11-23T02:55:39.5073789Z 2022-11-23T02:55:39.5074058Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5074171Z Ran 1 test in 10.104s 2022-11-23T02:55:39.5074190Z 2022-11-23T02:55:39.5074279Z OK 2022-11-23T02:55:39.5074298Z 2022-11-23T02:55:39.5074421Z Generating XML reports... 2022-11-23T02:55:39.5075106Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023719.xml 2022-11-23T02:55:39.5075462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5075630Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5076048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5076241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5076485Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_iexh7ez 2022-11-23T02:55:39.5076742Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_iexh7ez/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5076761Z 2022-11-23T02:55:39.5076862Z Running tests... 2022-11-23T02:55:39.5077113Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5077463Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5077794Z test_cuda_future_can_extract_custom_class_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5078052Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80930 2022-11-23T02:55:39.5078247Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80931 2022-11-23T02:55:39.5078452Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80932 2022-11-23T02:55:39.5078656Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80933 2022-11-23T02:55:39.5079012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5079181Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5079545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5079727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5080079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5080235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5080594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5080773Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5081120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5081289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5081643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5081820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5082337Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5082513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5082873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5083058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5083313Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxxaqtu2t 2022-11-23T02:55:39.5083584Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxxaqtu2t/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5083834Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp25_5d5h5 2022-11-23T02:55:39.5084095Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp25_5d5h5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5084342Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_ifx_a8a 2022-11-23T02:55:39.5084599Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_ifx_a8a/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5084898Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx9i1c2ef 2022-11-23T02:55:39.5085152Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx9i1c2ef/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5085377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5085594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5085818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5086041Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5086140Z ok (10.135s) 
2022-11-23T02:55:39.5086160Z 2022-11-23T02:55:39.5086431Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5086540Z Ran 1 test in 10.135s 2022-11-23T02:55:39.5086601Z 2022-11-23T02:55:39.5086676Z OK 2022-11-23T02:55:39.5086695Z 2022-11-23T02:55:39.5086820Z Generating XML reports... 2022-11-23T02:55:39.5087373Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023731.xml 2022-11-23T02:55:39.5087740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5087913Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5088288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5088631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5088873Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfm6ttl67 2022-11-23T02:55:39.5089298Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfm6ttl67/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5089322Z 2022-11-23T02:55:39.5089414Z Running tests... 2022-11-23T02:55:39.5089681Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5090036Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5090374Z test_cuda_future_can_extract_list_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5090591Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81269 2022-11-23T02:55:39.5090805Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81270 2022-11-23T02:55:39.5091020Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81271 2022-11-23T02:55:39.5091232Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81272 2022-11-23T02:55:39.5091584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5091763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5092298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5092481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5092833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5092998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5093351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5093529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5093872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5094073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5094611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5094798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5095155Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5095326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5095699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5095882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5096135Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_fhvg3np 2022-11-23T02:55:39.5096435Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_fhvg3np/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5096682Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4yo4pi5_ 2022-11-23T02:55:39.5096945Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4yo4pi5_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5097195Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqeeqljwd 2022-11-23T02:55:39.5097461Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqeeqljwd/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5097709Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprmi6fgf3 2022-11-23T02:55:39.5097971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprmi6fgf3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5098197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5098416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5098631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5099011Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5099107Z ok (10.109s) 
2022-11-23T02:55:39.5099126Z 2022-11-23T02:55:39.5099388Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5099497Z Ran 1 test in 10.109s 2022-11-23T02:55:39.5099515Z 2022-11-23T02:55:39.5099602Z OK 2022-11-23T02:55:39.5099620Z 2022-11-23T02:55:39.5099738Z Generating XML reports... 2022-11-23T02:55:39.5100266Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023744.xml 2022-11-23T02:55:39.5100607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5100780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5101143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5101326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5101568Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvpbv9rwr 2022-11-23T02:55:39.5101827Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvpbv9rwr/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5101845Z 2022-11-23T02:55:39.5101949Z Running tests... 2022-11-23T02:55:39.5102201Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5102546Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5102851Z test_cuda_future_can_extract_list_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5103124Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81728 2022-11-23T02:55:39.5103342Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81729 2022-11-23T02:55:39.5103547Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81730 2022-11-23T02:55:39.5103751Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81731 2022-11-23T02:55:39.5104300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5104472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5104839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5105021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5105538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5105788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5106168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5106353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5106710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5106880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5107249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5107435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5107775Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5107948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5108471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5108652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5108898Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmh_60jv9 2022-11-23T02:55:39.5109157Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmh_60jv9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5109399Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1mdx2jle 2022-11-23T02:55:39.5109655Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1mdx2jle/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5109899Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc96vxl0r 2022-11-23T02:55:39.5110141Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc96vxl0r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5110382Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp68vmbowh 2022-11-23T02:55:39.5110635Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp68vmbowh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5110852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5111068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5111283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5111493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5111588Z ok (10.121s) 
2022-11-23T02:55:39.5111607Z 2022-11-23T02:55:39.5111847Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5111959Z Ran 1 test in 10.121s 2022-11-23T02:55:39.5111977Z 2022-11-23T02:55:39.5112064Z OK 2022-11-23T02:55:39.5112180Z 2022-11-23T02:55:39.5112307Z Generating XML reports... 2022-11-23T02:55:39.5112835Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023757.xml 2022-11-23T02:55:39.5113190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5113358Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5113721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5113902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5114128Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp92a_8qb1 2022-11-23T02:55:39.5114386Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp92a_8qb1/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5114449Z 2022-11-23T02:55:39.5114557Z Running tests... 2022-11-23T02:55:39.5114809Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5115152Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5115449Z test_cuda_future_device_as_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5115658Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82063 2022-11-23T02:55:39.5115865Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82064 2022-11-23T02:55:39.5116057Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82065 2022-11-23T02:55:39.5116262Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82066 2022-11-23T02:55:39.5116619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5116795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5117155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5117339Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5117686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5117850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5118202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5118367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5118710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5118880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5119233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5119414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5119754Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5119920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5120281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5120444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5120694Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp964difxp 2022-11-23T02:55:39.5121003Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp964difxp/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5121253Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4j22rcvs 2022-11-23T02:55:39.5121510Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4j22rcvs/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5121749Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg4uv08zu 2022-11-23T02:55:39.5122003Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg4uv08zu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5122244Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaci23k_s 2022-11-23T02:55:39.5122498Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaci23k_s/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5122701Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5122966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5123186Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5123684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5123784Z ok (4.451s) 
2022-11-23T02:55:39.5123803Z 2022-11-23T02:55:39.5124073Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5124185Z Ran 1 test in 4.452s 2022-11-23T02:55:39.5124204Z 2022-11-23T02:55:39.5124294Z OK 2022-11-23T02:55:39.5124314Z 2022-11-23T02:55:39.5124419Z Generating XML reports... 2022-11-23T02:55:39.5124967Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023810.xml 2022-11-23T02:55:39.5125334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5125514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5125894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5126084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5126334Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe_nmrzb3 2022-11-23T02:55:39.5126599Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe_nmrzb3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5126618Z 2022-11-23T02:55:39.5126723Z Running tests... 2022-11-23T02:55:39.5126966Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5127322Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5127627Z test_cuda_future_device_as_int (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5127849Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82394 2022-11-23T02:55:39.5128064Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82395 2022-11-23T02:55:39.5128274Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82396 2022-11-23T02:55:39.5128483Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82397 2022-11-23T02:55:39.5128853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5129027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5129375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5129706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5130069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5130306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5130679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5130862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5131210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5131375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5131714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5131896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5132245Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5132463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5132819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5132998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5133247Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgz2lnqon 2022-11-23T02:55:39.5133509Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgz2lnqon/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5133756Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppdcurp10 2022-11-23T02:55:39.5133997Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppdcurp10/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5134241Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4_8ldf4x 2022-11-23T02:55:39.5134503Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4_8ldf4x/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5134742Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8ovlsd77 2022-11-23T02:55:39.5134995Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8ovlsd77/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5135213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5135432Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5135648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5135846Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5135947Z ok (4.623s) 
2022-11-23T02:55:39.5135966Z 2022-11-23T02:55:39.5136226Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5136337Z Ran 1 test in 4.623s 2022-11-23T02:55:39.5136356Z 2022-11-23T02:55:39.5136442Z OK 2022-11-23T02:55:39.5136463Z 2022-11-23T02:55:39.5136582Z Generating XML reports... 2022-11-23T02:55:39.5137288Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023818.xml 2022-11-23T02:55:39.5137656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5137828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5138187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5138375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5138628Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp999p1g8r 2022-11-23T02:55:39.5138895Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp999p1g8r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5138963Z 2022-11-23T02:55:39.5139077Z Running tests... 2022-11-23T02:55:39.5139340Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5139699Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5140160Z test_cuda_future_device_as_str (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5140352Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82725 2022-11-23T02:55:39.5140562Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82726 2022-11-23T02:55:39.5140768Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82727 2022-11-23T02:55:39.5140971Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82728 2022-11-23T02:55:39.5141327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5141548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5141914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5142098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5142448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5142782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5143153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5143343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5143700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5144066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5144453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5144639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5144997Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5145150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5145514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5145699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5145951Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp15fs24ed 2022-11-23T02:55:39.5146226Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp15fs24ed/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5146478Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnqpsy5d_ 2022-11-23T02:55:39.5146743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnqpsy5d_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5146992Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy5zjlvkj 2022-11-23T02:55:39.5147252Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy5zjlvkj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5147482Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbt25ave9 2022-11-23T02:55:39.5147743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbt25ave9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5147971Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5148198Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5148487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5148716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5148817Z ok (4.545s) 
2022-11-23T02:55:39.5148836Z 2022-11-23T02:55:39.5149103Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5149197Z Ran 1 test in 4.545s 2022-11-23T02:55:39.5149235Z 2022-11-23T02:55:39.5149308Z OK 2022-11-23T02:55:39.5149327Z 2022-11-23T02:55:39.5149448Z Generating XML reports... 2022-11-23T02:55:39.5150133Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023825.xml 2022-11-23T02:55:39.5150485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5150713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5151083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5151265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5151509Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp026iqv07 2022-11-23T02:55:39.5151912Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp026iqv07/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5151953Z 2022-11-23T02:55:39.5152043Z Running tests... 2022-11-23T02:55:39.5152306Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5152659Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5152967Z test_cuda_future_device_not_cuda (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5153192Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83056 2022-11-23T02:55:39.5153408Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83057 2022-11-23T02:55:39.5153622Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83058 2022-11-23T02:55:39.5153815Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83059 2022-11-23T02:55:39.5154182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5154356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5154728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5154918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5155275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5155453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5155824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5156015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5156353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5156522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5156887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5157075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5157576Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5157788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5158331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5158517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5158757Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3fl43t2k 2022-11-23T02:55:39.5159023Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3fl43t2k/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5159275Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpewb_fp25 2022-11-23T02:55:39.5159539Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpewb_fp25/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5159791Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq8oqii4k 2022-11-23T02:55:39.5160123Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq8oqii4k/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5160373Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkfe3k1tq 2022-11-23T02:55:39.5160637Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkfe3k1tq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5160861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5161069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5161293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5161510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5161606Z ok (4.561s) 
2022-11-23T02:55:39.5161625Z 2022-11-23T02:55:39.5161891Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5162006Z Ran 1 test in 4.561s 2022-11-23T02:55:39.5162025Z 2022-11-23T02:55:39.5162115Z OK 2022-11-23T02:55:39.5162137Z 2022-11-23T02:55:39.5162257Z Generating XML reports... 2022-11-23T02:55:39.5162803Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023833.xml 2022-11-23T02:55:39.5163151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5163521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5164066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5164258Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5164510Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgxpa2xsg 2022-11-23T02:55:39.5164780Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgxpa2xsg/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5164806Z 2022-11-23T02:55:39.5164912Z Running tests... 2022-11-23T02:55:39.5165175Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5165514Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5165832Z test_cuda_future_modify_tensor_inplace (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5166045Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83387 2022-11-23T02:55:39.5166259Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83388 2022-11-23T02:55:39.5166469Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83389 2022-11-23T02:55:39.5170307Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83390 2022-11-23T02:55:39.5170798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5170982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5171364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5171538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5171900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5172071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5172442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5172631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5172990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5173219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5173745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5173925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5174260Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5174426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5174779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5174958Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5175206Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv83jjrs8 2022-11-23T02:55:39.5175478Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv83jjrs8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5175727Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpulhq4uuh 2022-11-23T02:55:39.5175987Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpulhq4uuh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5176210Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoc1wlsiu 2022-11-23T02:55:39.5176471Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoc1wlsiu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5176715Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgtl9gpn9 2022-11-23T02:55:39.5176971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgtl9gpn9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5177195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5177420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5177643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5177865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5177969Z ok (6.293s) 
2022-11-23T02:55:39.5177990Z 2022-11-23T02:55:39.5178235Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5178350Z Ran 1 test in 6.293s 2022-11-23T02:55:39.5178369Z 2022-11-23T02:55:39.5178456Z OK 2022-11-23T02:55:39.5178475Z 2022-11-23T02:55:39.5178594Z Generating XML reports... 2022-11-23T02:55:39.5179123Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023840.xml 2022-11-23T02:55:39.5179480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5179653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5180063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5180236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5180482Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprkrby0fp 2022-11-23T02:55:39.5180742Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprkrby0fp/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5180761Z 2022-11-23T02:55:39.5180865Z Running tests... 2022-11-23T02:55:39.5181122Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5181469Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5181767Z test_cuda_future_replace_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5182027Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83722 2022-11-23T02:55:39.5182234Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83723 2022-11-23T02:55:39.5182601Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83724 2022-11-23T02:55:39.5182813Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83725 2022-11-23T02:55:39.5183186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5183361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5183739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5184230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5184624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5184807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5185162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5185355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5185715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5185884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5186248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5186432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5186790Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5186969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5187332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5187500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5187752Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptj0_ee8v 2022-11-23T02:55:39.5188018Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptj0_ee8v/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5188273Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphryw06xv 2022-11-23T02:55:39.5188538Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphryw06xv/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5188946Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptgcgs8_u 2022-11-23T02:55:39.5189293Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptgcgs8_u/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5189717Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjk1b8j9o 2022-11-23T02:55:39.5189979Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjk1b8j9o/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5190191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5190416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5190639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5190857Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5190951Z ok (6.220s) 
2022-11-23T02:55:39.5190971Z 2022-11-23T02:55:39.5191243Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5191414Z Ran 1 test in 6.220s 2022-11-23T02:55:39.5191433Z 2022-11-23T02:55:39.5191524Z OK 2022-11-23T02:55:39.5191547Z 2022-11-23T02:55:39.5191654Z Generating XML reports... 2022-11-23T02:55:39.5192201Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023849.xml 2022-11-23T02:55:39.5192573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5192745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5193122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5193312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5193565Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvwxpur9n 2022-11-23T02:55:39.5193836Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvwxpur9n/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5193862Z 2022-11-23T02:55:39.5193969Z Running tests... 2022-11-23T02:55:39.5194217Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5194573Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5194886Z test_cuda_future_value_on_bad_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5195104Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84057 2022-11-23T02:55:39.5195319Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84058 2022-11-23T02:55:39.5195533Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84059 2022-11-23T02:55:39.5195741Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84060 2022-11-23T02:55:39.5196123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5196283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5196662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5196849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5197208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5197379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5197749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5197937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5198294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5198507Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5198868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5199043Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5199388Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5199549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5199914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5200093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5200340Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpltb6lch4 2022-11-23T02:55:39.5200652Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpltb6lch4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5201047Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp075t8mw4 2022-11-23T02:55:39.5201294Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp075t8mw4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5201528Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprxp0lec9 2022-11-23T02:55:39.5201773Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprxp0lec9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5202003Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz9f169wh 2022-11-23T02:55:39.5202242Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz9f169wh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5202450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5202660Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5202869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5203069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5203156Z ok (11.451s) 
2022-11-23T02:55:39.5203175Z 2022-11-23T02:55:39.5203425Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5203522Z Ran 1 test in 11.451s 2022-11-23T02:55:39.5203541Z 2022-11-23T02:55:39.5203618Z OK 2022-11-23T02:55:39.5203637Z 2022-11-23T02:55:39.5203744Z Generating XML reports... 2022-11-23T02:55:39.5204266Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023858.xml 2022-11-23T02:55:39.5204616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5204772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5205129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5205304Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5205712Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6x20gdii 2022-11-23T02:55:39.5205970Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6x20gdii/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5205990Z 2022-11-23T02:55:39.5206087Z Running tests... 2022-11-23T02:55:39.5206342Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5206689Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5206968Z test_custom_stream (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5207758Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79750 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.694s) 2022-11-23T02:55:39.5207789Z 2022-11-23T02:55:39.5208042Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5208142Z Ran 1 test in 1.694s 2022-11-23T02:55:39.5208162Z 2022-11-23T02:55:39.5208255Z OK (skipped=1) 2022-11-23T02:55:39.5208274Z 2022-11-23T02:55:39.5208386Z Generating XML reports... 2022-11-23T02:55:39.5209071Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023913.xml 2022-11-23T02:55:39.5209589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5209805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5210179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5210361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5210600Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1nph74ct 2022-11-23T02:55:39.5210865Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1nph74ct/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5210886Z 2022-11-23T02:55:39.5210984Z Running tests... 2022-11-23T02:55:39.5211238Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5211583Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5211867Z test_custom_stream_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5212083Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84462 2022-11-23T02:55:39.5212289Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84463 2022-11-23T02:55:39.5212486Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84464 2022-11-23T02:55:39.5212689Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84465 2022-11-23T02:55:39.5213048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5213215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5213744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5213917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5214256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5214423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5214773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5214937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5215277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5215432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5215774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5215942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5216276Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5216484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5216841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5217001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5217235Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4wi6ccvc 2022-11-23T02:55:39.5217664Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4wi6ccvc/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5217905Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp19jyq4cl 2022-11-23T02:55:39.5218162Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp19jyq4cl/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5218401Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0_3_1mrs 2022-11-23T02:55:39.5218704Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0_3_1mrs/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5218948Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqqvo5vel 2022-11-23T02:55:39.5219199Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqqvo5vel/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5219411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5219626Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5219838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5220050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5220189Z fi_getinfo: -61 
2022-11-23T02:55:39.5220314Z fi_getinfo: -61 2022-11-23T02:55:39.5220436Z fi_getinfo: -61 2022-11-23T02:55:39.5220551Z fi_getinfo: -61 2022-11-23T02:55:39.5220642Z ok (20.049s) 2022-11-23T02:55:39.5220662Z 2022-11-23T02:55:39.5221077Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5221176Z Ran 1 test in 20.049s 2022-11-23T02:55:39.5221194Z 2022-11-23T02:55:39.5221271Z OK 2022-11-23T02:55:39.5221289Z 2022-11-23T02:55:39.5221397Z Generating XML reports... 2022-11-23T02:55:39.5221910Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023917.xml 2022-11-23T02:55:39.5222256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5222413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5222934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5223112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5223364Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd81h6a_p 2022-11-23T02:55:39.5223624Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd81h6a_p/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5223644Z 2022-11-23T02:55:39.5223747Z Running tests... 2022-11-23T02:55:39.5224206Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5224562Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5224850Z test_custom_stream_nested (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5225049Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84981 2022-11-23T02:55:39.5225254Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84982 2022-11-23T02:55:39.5225456Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84983 2022-11-23T02:55:39.5225739Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84984 2022-11-23T02:55:39.5226114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5226278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5226644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5226822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5227174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5227328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5227701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5227936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5228289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5228451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5228800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5228961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5229321Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5229492Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5229857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5230195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5230441Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkazotnrz 2022-11-23T02:55:39.5230692Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkazotnrz/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5230925Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpey3p2ev7 2022-11-23T02:55:39.5231170Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpey3p2ev7/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5231400Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2y_511f7 2022-11-23T02:55:39.5231628Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpox8357yw 2022-11-23T02:55:39.5231859Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2y_511f7/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5232278Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpox8357yw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5232506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5232724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5232941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5233157Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5233299Z fi_getinfo: -61 
2022-11-23T02:55:39.5233428Z fi_getinfo: -61 2022-11-23T02:55:39.5233545Z fi_getinfo: -61 2022-11-23T02:55:39.5233667Z fi_getinfo: -61 2022-11-23T02:55:39.5233757Z ok (13.560s) 2022-11-23T02:55:39.5233776Z 2022-11-23T02:55:39.5234030Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5234133Z Ran 1 test in 13.561s 2022-11-23T02:55:39.5234152Z 2022-11-23T02:55:39.5234233Z OK 2022-11-23T02:55:39.5234253Z 2022-11-23T02:55:39.5234371Z Generating XML reports... 2022-11-23T02:55:39.5234944Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023940.xml 2022-11-23T02:55:39.5235459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5235616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5235967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5236141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5236375Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv23jmcpy 2022-11-23T02:55:39.5236620Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv23jmcpy/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5236639Z 2022-11-23T02:55:39.5236778Z Running tests... 2022-11-23T02:55:39.5237027Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5237534Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5237831Z test_custom_stream_nested_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5238039Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85500 2022-11-23T02:55:39.5238244Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85501 2022-11-23T02:55:39.5238446Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 85502 2022-11-23T02:55:39.5238646Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 85503 2022-11-23T02:55:39.5239005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5239169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5239543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5239716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5240065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5240383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5240731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5240901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5241236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5241391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5241738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5241888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5242238Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5242410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5242759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5243107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5243355Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnu0ucos4 2022-11-23T02:55:39.5243621Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnu0ucos4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5243866Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpewd1ua73 2022-11-23T02:55:39.5244165Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpewd1ua73/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5244408Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplrxsh2k1 2022-11-23T02:55:39.5244661Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplrxsh2k1/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5244898Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq2t7jxt0 2022-11-23T02:55:39.5245151Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq2t7jxt0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5245367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5245586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5245959Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5246259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5246384Z fi_getinfo: -61 
2022-11-23T02:55:39.5246505Z fi_getinfo: -61 2022-11-23T02:55:39.5246622Z fi_getinfo: -61 2022-11-23T02:55:39.5246740Z fi_getinfo: -61 2022-11-23T02:55:39.5246999Z ok (11.935s) 2022-11-23T02:55:39.5247019Z 2022-11-23T02:55:39.5247272Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5247373Z Ran 1 test in 11.935s 2022-11-23T02:55:39.5247392Z 2022-11-23T02:55:39.5247465Z OK 2022-11-23T02:55:39.5247492Z 2022-11-23T02:55:39.5247596Z Generating XML reports... 2022-11-23T02:55:39.5248125Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023956.xml 2022-11-23T02:55:39.5248478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5248647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5249012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5249190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5249438Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt90h7hbq 2022-11-23T02:55:39.5249855Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt90h7hbq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5249874Z 2022-11-23T02:55:39.5249961Z Running tests... 2022-11-23T02:55:39.5250207Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5250542Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5250812Z test_device_map_cpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5251019Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86014 2022-11-23T02:55:39.5251217Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86015 2022-11-23T02:55:39.5251409Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86016 2022-11-23T02:55:39.5251602Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86017 2022-11-23T02:55:39.5251950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5252104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5252457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5252628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5252965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5253184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5253551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5253725Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5254062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5254210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5254556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5254905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5255262Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5255478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5255841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5256017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5256262Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps61s5r9e 2022-11-23T02:55:39.5256519Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps61s5r9e/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5256756Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0obghdpl 2022-11-23T02:55:39.5257013Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0obghdpl/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5257250Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpptkxtbxe 2022-11-23T02:55:39.5257511Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpptkxtbxe/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5257901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr0yebihx 2022-11-23T02:55:39.5258143Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr0yebihx/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5258533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5258748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5258957Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5259165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5259300Z fi_getinfo: -61 
2022-11-23T02:55:39.5259423Z fi_getinfo: -61 2022-11-23T02:55:39.5259546Z fi_getinfo: -61 2022-11-23T02:55:39.5259675Z fi_getinfo: -61 2022-11-23T02:55:39.5259763Z ok (5.131s) 2022-11-23T02:55:39.5259782Z 2022-11-23T02:55:39.5260038Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5260133Z Ran 1 test in 5.132s 2022-11-23T02:55:39.5260152Z 2022-11-23T02:55:39.5260233Z OK 2022-11-23T02:55:39.5260252Z 2022-11-23T02:55:39.5260365Z Generating XML reports... 2022-11-23T02:55:39.5260905Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024011.xml 2022-11-23T02:55:39.5261266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5261433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5261802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5261981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5262419Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpihv7s_t0 2022-11-23T02:55:39.5262680Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpihv7s_t0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5262699Z 2022-11-23T02:55:39.5262793Z Running tests... 2022-11-23T02:55:39.5263042Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5263376Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5263714Z test_device_map_cpu_to_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5264261Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86517 2022-11-23T02:55:39.5264476Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86518 2022-11-23T02:55:39.5264678Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86519 2022-11-23T02:55:39.5264955Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86520 2022-11-23T02:55:39.5265327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5265493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5265861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5266039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5266392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5266554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5266915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5267092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5267450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5267614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5268173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5268342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5268678Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5268832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5269172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5269342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5269574Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxyaof_nk 2022-11-23T02:55:39.5269823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxyaof_nk/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5270055Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_gp5w06_ 2022-11-23T02:55:39.5270301Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_gp5w06_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5270533Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj4_ww9ql 2022-11-23T02:55:39.5270774Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj4_ww9ql/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5271182Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps5edqu0q 2022-11-23T02:55:39.5271432Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps5edqu0q/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5271709Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5271935Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5272149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5272363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5272498Z fi_getinfo: -61 
2022-11-23T02:55:39.5272620Z fi_getinfo: -61 2022-11-23T02:55:39.5272741Z fi_getinfo: -61 2022-11-23T02:55:39.5272863Z fi_getinfo: -61 2022-11-23T02:55:39.5272944Z ok (8.080s) 2022-11-23T02:55:39.5272964Z 2022-11-23T02:55:39.5273219Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5273319Z Ran 1 test in 8.080s 2022-11-23T02:55:39.5273338Z 2022-11-23T02:55:39.5273417Z OK 2022-11-23T02:55:39.5273480Z 2022-11-23T02:55:39.5273595Z Generating XML reports... 2022-11-23T02:55:39.5274285Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024019.xml 2022-11-23T02:55:39.5274634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5274794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5275141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5275315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5275552Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb3mv3pge 2022-11-23T02:55:39.5275799Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb3mv3pge/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5275818Z 2022-11-23T02:55:39.5275917Z Running tests... 2022-11-23T02:55:39.5276163Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5276498Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5276792Z test_device_map_cpu_to_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5276989Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87028 2022-11-23T02:55:39.5277179Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87029 2022-11-23T02:55:39.5277375Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87030 2022-11-23T02:55:39.5277571Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87031 2022-11-23T02:55:39.5277917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5278079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5278436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5278607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5278944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5279095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5279440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5279612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5279949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5280108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5280508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5280686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5281031Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5281184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5281528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5281699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5281935Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkxglv2s0 2022-11-23T02:55:39.5282184Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkxglv2s0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5282474Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdkotdgdv 2022-11-23T02:55:39.5282894Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdkotdgdv/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5283137Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxpxjvz1q 2022-11-23T02:55:39.5283396Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxpxjvz1q/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5283628Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3ldv9u72 2022-11-23T02:55:39.5283879Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3ldv9u72/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5284096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5284310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5284528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5284744Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5284880Z fi_getinfo: -61 
2022-11-23T02:55:39.5285004Z fi_getinfo: -61 2022-11-23T02:55:39.5285119Z fi_getinfo: -61 2022-11-23T02:55:39.5285242Z fi_getinfo: -61 2022-11-23T02:55:39.5285329Z ok (8.035s) 2022-11-23T02:55:39.5285349Z 2022-11-23T02:55:39.5285602Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5285860Z Ran 1 test in 8.036s 2022-11-23T02:55:39.5285878Z 2022-11-23T02:55:39.5285954Z OK 2022-11-23T02:55:39.5285972Z 2022-11-23T02:55:39.5286080Z Generating XML reports... 2022-11-23T02:55:39.5286781Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024030.xml 2022-11-23T02:55:39.5287131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5287302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5287668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5287846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5288088Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa1hq0c6f 2022-11-23T02:55:39.5288347Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa1hq0c6f/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5288367Z 2022-11-23T02:55:39.5288463Z Running tests... 2022-11-23T02:55:39.5288713Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5289057Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5289491Z test_device_map_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5289753Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87539 2022-11-23T02:55:39.5290131Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87540 2022-11-23T02:55:39.5290335Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87541 2022-11-23T02:55:39.5290532Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87542 2022-11-23T02:55:39.5290892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5291056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5291420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5291593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5291996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5292158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5292518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5292695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5293196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5293351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5293693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5293863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5294198Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5294361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5294704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5294871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5295105Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpltvx384h 2022-11-23T02:55:39.5295353Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpltvx384h/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5295588Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzqf2po05 2022-11-23T02:55:39.5296014Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzqf2po05/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5296248Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5v3v2fdu 2022-11-23T02:55:39.5296510Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5v3v2fdu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5296750Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfbcy9hlw 2022-11-23T02:55:39.5297010Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfbcy9hlw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5297227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5297443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5297658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5297870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5298004Z fi_getinfo: -61 
2022-11-23T02:55:39.5298123Z fi_getinfo: -61 2022-11-23T02:55:39.5298248Z fi_getinfo: -61 2022-11-23T02:55:39.5298375Z fi_getinfo: -61 2022-11-23T02:55:39.5298463Z ok (8.147s) 2022-11-23T02:55:39.5298528Z 2022-11-23T02:55:39.5298791Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5298895Z Ran 1 test in 8.147s 2022-11-23T02:55:39.5298914Z 2022-11-23T02:55:39.5298993Z OK 2022-11-23T02:55:39.5299012Z 2022-11-23T02:55:39.5299117Z Generating XML reports... 2022-11-23T02:55:39.5299801Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024041.xml 2022-11-23T02:55:39.5300144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5300301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5300653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5300879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5301155Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm782j4i8 2022-11-23T02:55:39.5301450Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm782j4i8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5301470Z 2022-11-23T02:55:39.5301610Z Running tests... 2022-11-23T02:55:39.5301852Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5302237Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5302588Z test_device_map_gpu_default_to_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5303331Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/80008 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.670s) 2022-11-23T02:55:39.5303499Z 2022-11-23T02:55:39.5303748Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5304280Z Ran 1 test in 1.670s 2022-11-23T02:55:39.5304303Z 2022-11-23T02:55:39.5304459Z OK (skipped=1) 2022-11-23T02:55:39.5304479Z 2022-11-23T02:55:39.5304638Z Generating XML reports... 2022-11-23T02:55:39.5305261Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024052.xml 2022-11-23T02:55:39.5305668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5305880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5306307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5306590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5306840Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvqojp5w1 2022-11-23T02:55:39.5307149Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvqojp5w1/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5307170Z 2022-11-23T02:55:39.5307313Z Running tests... 
2022-11-23T02:55:39.5307618Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5308014Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5308353Z test_device_map_gpu_mixed_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5308620Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88112 2022-11-23T02:55:39.5308872Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88113 2022-11-23T02:55:39.5309074Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88114 2022-11-23T02:55:39.5309437Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88115 2022-11-23T02:55:39.5309867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5310080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5310527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5310765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5311321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5311526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5311930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5312161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5312590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5312794Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5313191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5313421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5313803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5314008Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5314409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5314582Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5314870Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxbdn8jbs 2022-11-23T02:55:39.5315202Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxbdn8jbs/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5315483Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4t8hlry6 2022-11-23T02:55:39.5315819Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4t8hlry6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5316096Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp50suikg2 2022-11-23T02:55:39.5316385Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp50suikg2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5316659Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz0mjb7aq 2022-11-23T02:55:39.5316952Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz0mjb7aq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5317160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5317413Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 1 2022-11-23T02:55:39.5317700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5317962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5318142Z fi_getinfo: -61 2022-11-23T02:55:39.5318307Z fi_getinfo: -61 2022-11-23T02:55:39.5318471Z fi_getinfo: -61 2022-11-23T02:55:39.5318583Z fi_getinfo: -61 2022-11-23T02:55:39.5318716Z ok (10.828s) 2022-11-23T02:55:39.5318735Z 2022-11-23T02:55:39.5319028Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5319170Z Ran 1 test in 10.828s 2022-11-23T02:55:39.5319191Z 2022-11-23T02:55:39.5319357Z OK 2022-11-23T02:55:39.5319377Z 2022-11-23T02:55:39.5319565Z Generating XML reports... 2022-11-23T02:55:39.5320182Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024057.xml 2022-11-23T02:55:39.5320585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5320740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5321145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5321366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5321653Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcfy1fz07 2022-11-23T02:55:39.5321946Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcfy1fz07/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5322008Z 2022-11-23T02:55:39.5322190Z Running tests... 
2022-11-23T02:55:39.5322490Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5322881Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5323206Z test_device_map_gpu_mixed_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5323400Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88623 2022-11-23T02:55:39.5323646Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88624 2022-11-23T02:55:39.5323899Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88625 2022-11-23T02:55:39.5324139Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88626 2022-11-23T02:55:39.5324731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5325019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5325440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5325665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5326014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5326225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5326642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5326867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5327260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5327472Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5327919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5328148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5328552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5328706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5329118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5349081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5349391Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb_kd_iaw 2022-11-23T02:55:39.5349657Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb_kd_iaw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5350039Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfe44611n 2022-11-23T02:55:39.5350469Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfe44611n/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5350705Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfzv6n6bd 2022-11-23T02:55:39.5350947Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfzv6n6bd/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5351176Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6940gm1u 2022-11-23T02:55:39.5351419Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6940gm1u/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5351630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5351843Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5352119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5352325Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5352490Z fi_getinfo: -61 2022-11-23T02:55:39.5352606Z fi_getinfo: -61 2022-11-23T02:55:39.5352725Z fi_getinfo: -61 2022-11-23T02:55:39.5352842Z fi_getinfo: -61 2022-11-23T02:55:39.5352927Z ok (10.549s) 2022-11-23T02:55:39.5352948Z 2022-11-23T02:55:39.5353206Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5353306Z Ran 1 test in 10.549s 2022-11-23T02:55:39.5353325Z 2022-11-23T02:55:39.5353400Z OK 2022-11-23T02:55:39.5353419Z 2022-11-23T02:55:39.5353521Z Generating XML reports... 2022-11-23T02:55:39.5354046Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024110.xml 2022-11-23T02:55:39.5354396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5354562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5354921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5355272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5355515Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpulb9r58v 2022-11-23T02:55:39.5355771Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpulb9r58v/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5355791Z 2022-11-23T02:55:39.5355891Z Running tests... 
2022-11-23T02:55:39.5356140Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5356488Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5356789Z test_device_map_gpu_mixed_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5356997Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89134 2022-11-23T02:55:39.5357202Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89135 2022-11-23T02:55:39.5357404Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89136 2022-11-23T02:55:39.5357601Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89137 2022-11-23T02:55:39.5358121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5358280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5358631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5358979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5359388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5359555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5359917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5360094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5360441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5360601Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5360952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5361131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5361542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5361701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5362056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5362232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5362477Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2qp59snz 2022-11-23T02:55:39.5362735Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2qp59snz/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5362978Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7hv7tjhz 2022-11-23T02:55:39.5363227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7hv7tjhz/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5363670Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgu9wxfxr 2022-11-23T02:55:39.5363929Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgu9wxfxr/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5364161Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp08mrykq5 2022-11-23T02:55:39.5364401Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp08mrykq5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5364784Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5364999Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5365214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5365418Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5365565Z fi_getinfo: -61 2022-11-23T02:55:39.5365693Z fi_getinfo: -61 2022-11-23T02:55:39.5365819Z fi_getinfo: -61 2022-11-23T02:55:39.5365940Z fi_getinfo: -61 2022-11-23T02:55:39.5366030Z ok (10.521s) 2022-11-23T02:55:39.5366050Z 2022-11-23T02:55:39.5366306Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5366403Z Ran 1 test in 10.521s 2022-11-23T02:55:39.5366429Z 2022-11-23T02:55:39.5366502Z OK 2022-11-23T02:55:39.5366521Z 2022-11-23T02:55:39.5366640Z Generating XML reports... 2022-11-23T02:55:39.5367178Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024124.xml 2022-11-23T02:55:39.5367536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5367759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5368127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5368522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5368766Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4uj_pcdi 2022-11-23T02:55:39.5369008Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4uj_pcdi/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5369034Z 2022-11-23T02:55:39.5369120Z Running tests... 
2022-11-23T02:55:39.5369362Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5369695Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5369976Z test_device_map_gpu_mixed_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5370174Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89645 2022-11-23T02:55:39.5370372Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89646 2022-11-23T02:55:39.5370621Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89647 2022-11-23T02:55:39.5370809Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89648 2022-11-23T02:55:39.5371160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5371492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5371864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5372043Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5372391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5372551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5372915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5373101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5373452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5373615Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5373973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5374307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5374648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5374802Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5375145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5375320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5375557Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplp63da3n 2022-11-23T02:55:39.5375801Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplp63da3n/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5376032Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp74jr5wft 2022-11-23T02:55:39.5376276Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp74jr5wft/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5376506Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpluk35ua3 2022-11-23T02:55:39.5376747Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpluk35ua3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5376974Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb93x6tuq 2022-11-23T02:55:39.5377295Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb93x6tuq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5377514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5377715Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5377923Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5378127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5378257Z fi_getinfo: -61 2022-11-23T02:55:39.5378377Z fi_getinfo: -61 2022-11-23T02:55:39.5378494Z fi_getinfo: -61 2022-11-23T02:55:39.5378611Z fi_getinfo: -61 2022-11-23T02:55:39.5378688Z ok (10.646s) 2022-11-23T02:55:39.5378712Z 2022-11-23T02:55:39.5378950Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5379047Z Ran 1 test in 10.646s 2022-11-23T02:55:39.5379110Z 2022-11-23T02:55:39.5379188Z OK 2022-11-23T02:55:39.5379206Z 2022-11-23T02:55:39.5379317Z Generating XML reports... 2022-11-23T02:55:39.5379836Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024137.xml 2022-11-23T02:55:39.5380182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5380342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5380698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5380863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5381100Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphm9r1xll 2022-11-23T02:55:39.5381347Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphm9r1xll/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5381370Z 2022-11-23T02:55:39.5381462Z Running tests... 
2022-11-23T02:55:39.5381706Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5382041Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5382323Z test_device_map_gpu_mixed_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5382524Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90156 2022-11-23T02:55:39.5382725Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90157 2022-11-23T02:55:39.5382915Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90158 2022-11-23T02:55:39.5383277Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90159 2022-11-23T02:55:39.5383641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5383816Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5384437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5384617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5384969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5385130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5385482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5385658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5386167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5386327Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5386919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5387106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5387459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5387623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5387987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5388159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5388405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyx179zi4 2022-11-23T02:55:39.5388664Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyx179zi4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5388972Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb43nk44j 2022-11-23T02:55:39.5389227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb43nk44j/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5389620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3v_erqxl 2022-11-23T02:55:39.5389864Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3v_erqxl/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5390092Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt6u3987r 2022-11-23T02:55:39.5390495Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt6u3987r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5390714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5390929Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.5391149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5391361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5391498Z fi_getinfo: -61 2022-11-23T02:55:39.5391622Z fi_getinfo: -61 2022-11-23T02:55:39.5391744Z fi_getinfo: -61 2022-11-23T02:55:39.5391859Z fi_getinfo: -61 2022-11-23T02:55:39.5391948Z ok (10.551s) 2022-11-23T02:55:39.5391968Z 2022-11-23T02:55:39.5392222Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5392324Z Ran 1 test in 10.551s 2022-11-23T02:55:39.5392344Z 2022-11-23T02:55:39.5392423Z OK 2022-11-23T02:55:39.5392442Z 2022-11-23T02:55:39.5392556Z Generating XML reports... 2022-11-23T02:55:39.5393092Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024150.xml 2022-11-23T02:55:39.5393609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5393761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5394111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5394284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5394516Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9mu_qjvu 2022-11-23T02:55:39.5394761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9mu_qjvu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5394780Z 2022-11-23T02:55:39.5394873Z Running tests... 
2022-11-23T02:55:39.5395123Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5395457Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5395786Z test_device_map_gpu_mixed_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5395985Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90675 2022-11-23T02:55:39.5396184Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90676 2022-11-23T02:55:39.5396379Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90677 2022-11-23T02:55:39.5396749Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90678 2022-11-23T02:55:39.5397112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5397274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5397636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5397815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5398225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5398385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5398745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5398922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5399268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5399590Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5399937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5400107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5400454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5400604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5400949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5401116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5401352Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiq27gc2o 2022-11-23T02:55:39.5401603Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiq27gc2o/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5401837Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnk7arrl8 2022-11-23T02:55:39.5402085Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnk7arrl8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5402327Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyej6q628 2022-11-23T02:55:39.5402569Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyej6q628/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5402802Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi698n8uf 2022-11-23T02:55:39.5403042Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi698n8uf/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5403248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5403463Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 1 2022-11-23T02:55:39.5403668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5403875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5404006Z fi_getinfo: -61 2022-11-23T02:55:39.5404130Z fi_getinfo: -61 2022-11-23T02:55:39.5404255Z fi_getinfo: -61 2022-11-23T02:55:39.5404417Z fi_getinfo: -61 2022-11-23T02:55:39.5404507Z ok (10.632s) 2022-11-23T02:55:39.5404525Z 2022-11-23T02:55:39.5404773Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5404871Z Ran 1 test in 10.632s 2022-11-23T02:55:39.5404889Z 2022-11-23T02:55:39.5404967Z OK 2022-11-23T02:55:39.5404985Z 2022-11-23T02:55:39.5405094Z Generating XML reports... 2022-11-23T02:55:39.5405604Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024204.xml 2022-11-23T02:55:39.5405953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5406112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5406650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5406882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5407125Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdbi8luyy 2022-11-23T02:55:39.5407381Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdbi8luyy/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5407401Z 2022-11-23T02:55:39.5407498Z Running tests... 
2022-11-23T02:55:39.5407748Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5408095Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5408386Z test_device_map_gpu_mixed_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5408593Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91194 2022-11-23T02:55:39.5408797Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91195 2022-11-23T02:55:39.5409005Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91196 2022-11-23T02:55:39.5409207Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91197 2022-11-23T02:55:39.5409719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5409878Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5410226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5410398Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5410735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5410890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5411238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5411414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5411753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5411906Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5412241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5412411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5412752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5412904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5413245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5413460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5413705Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2rntucpu 2022-11-23T02:55:39.5413956Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2rntucpu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5414189Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl1jlu618 2022-11-23T02:55:39.5414429Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl1jlu618/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5414664Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgc7tmj0z 2022-11-23T02:55:39.5414910Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgc7tmj0z/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5415141Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_rv87r3_ 2022-11-23T02:55:39.5415429Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_rv87r3_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5415637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5415844Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.5416050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5416255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5416380Z fi_getinfo: -61 2022-11-23T02:55:39.5416498Z fi_getinfo: -61 2022-11-23T02:55:39.5416617Z fi_getinfo: -61 2022-11-23T02:55:39.5416735Z fi_getinfo: -61 2022-11-23T02:55:39.5416821Z ok (10.745s) 2022-11-23T02:55:39.5416839Z 2022-11-23T02:55:39.5417086Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5417183Z Ran 1 test in 10.746s 2022-11-23T02:55:39.5417206Z 2022-11-23T02:55:39.5417276Z OK 2022-11-23T02:55:39.5417294Z 2022-11-23T02:55:39.5417406Z Generating XML reports... 2022-11-23T02:55:39.5417923Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024217.xml 2022-11-23T02:55:39.5418268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5418428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5418785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5418960Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5419196Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnnipontj 2022-11-23T02:55:39.5419439Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnnipontj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5419467Z 2022-11-23T02:55:39.5419557Z Running tests... 
2022-11-23T02:55:39.5419801Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5420135Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5420419Z test_device_map_gpu_mixed_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5420620Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91713 2022-11-23T02:55:39.5420818Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91714 2022-11-23T02:55:39.5421013Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91715 2022-11-23T02:55:39.5421203Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91716 2022-11-23T02:55:39.5421543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5421750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5422121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5422294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5422632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5422791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5423139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5423428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5423775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5424364Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5424743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5424922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5425277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5425441Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5425795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5425970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5426218Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp71ofhrzy 2022-11-23T02:55:39.5426470Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp71ofhrzy/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5426718Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnzm1laiu 2022-11-23T02:55:39.5426973Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnzm1laiu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5427210Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbjbtejr7 2022-11-23T02:55:39.5427466Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbjbtejr7/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5427704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpew84k1t9 2022-11-23T02:55:39.5427954Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpew84k1t9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5428173Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5428387Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.5428602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5428819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5428955Z fi_getinfo: -61 2022-11-23T02:55:39.5429078Z fi_getinfo: -61 2022-11-23T02:55:39.5429201Z fi_getinfo: -61 2022-11-23T02:55:39.5429323Z fi_getinfo: -61 2022-11-23T02:55:39.5429410Z ok (10.749s) 2022-11-23T02:55:39.5429431Z 2022-11-23T02:55:39.5429679Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5429780Z Ran 1 test in 10.749s 2022-11-23T02:55:39.5429800Z 2022-11-23T02:55:39.5429878Z OK 2022-11-23T02:55:39.5429897Z 2022-11-23T02:55:39.5430009Z Generating XML reports... 2022-11-23T02:55:39.5430545Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024231.xml 2022-11-23T02:55:39.5431131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5431299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5431658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5431824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5432061Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5g5qrzkj 2022-11-23T02:55:39.5432309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5g5qrzkj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5432328Z 2022-11-23T02:55:39.5432419Z Running tests... 
2022-11-23T02:55:39.5432663Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5432994Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5433347Z test_device_map_gpu_mixed_self_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5433547Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92232 2022-11-23T02:55:39.5433746Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92233 2022-11-23T02:55:39.5433935Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 92234 2022-11-23T02:55:39.5434128Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 92235 2022-11-23T02:55:39.5434473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5434632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5434986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5435166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5435505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5435663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5436003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5436176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5436511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5436667Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5437010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5437178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5437524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5437679Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5438021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5438366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5438611Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmjxbn79v 2022-11-23T02:55:39.5438871Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmjxbn79v/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5439111Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6hzf4j0c 2022-11-23T02:55:39.5439370Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6hzf4j0c/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5439661Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr53wqp4n 2022-11-23T02:55:39.5439923Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr53wqp4n/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5440162Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsspfirrw 2022-11-23T02:55:39.5440419Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsspfirrw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5440629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5440844Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5441218Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5441425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5441558Z fi_getinfo: -61 2022-11-23T02:55:39.5441724Z fi_getinfo: -61 2022-11-23T02:55:39.5441841Z fi_getinfo: -61 2022-11-23T02:55:39.5441957Z fi_getinfo: -61 2022-11-23T02:55:39.5442043Z ok (10.775s) 2022-11-23T02:55:39.5442062Z 2022-11-23T02:55:39.5442306Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5442407Z Ran 1 test in 10.775s 2022-11-23T02:55:39.5442425Z 2022-11-23T02:55:39.5442504Z OK 2022-11-23T02:55:39.5442522Z 2022-11-23T02:55:39.5442630Z Generating XML reports... 2022-11-23T02:55:39.5443151Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024245.xml 2022-11-23T02:55:39.5443494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5443644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5444181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5444367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5444612Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmple5ugl7n 2022-11-23T02:55:39.5444869Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmple5ugl7n/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5444889Z 2022-11-23T02:55:39.5444983Z Running tests... 
2022-11-23T02:55:39.5445233Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5445579Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5445874Z test_device_map_gpu_mixed_self_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5446074Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92743 2022-11-23T02:55:39.5446276Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92744 2022-11-23T02:55:39.5446488Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 92745 2022-11-23T02:55:39.5446852Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 92746 2022-11-23T02:55:39.5447385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5447548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5447917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5448100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5448441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5448602Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5449016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5449202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5449548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5449709Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5450066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5450240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5450750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5450899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5451240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5451478Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5451719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsqjo_d2d 2022-11-23T02:55:39.5451974Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsqjo_d2d/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5452382Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm0h0pxaz 2022-11-23T02:55:39.5452639Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm0h0pxaz/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5452881Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuw9b1z0k 2022-11-23T02:55:39.5453127Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuw9b1z0k/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5453367Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjcburnrm 2022-11-23T02:55:39.5453628Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjcburnrm/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5453845Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5454061Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.5454273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5454485Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5454623Z fi_getinfo: -61 2022-11-23T02:55:39.5454747Z fi_getinfo: -61 2022-11-23T02:55:39.5454863Z fi_getinfo: -61 2022-11-23T02:55:39.5454985Z fi_getinfo: -61 2022-11-23T02:55:39.5455072Z ok (10.569s) 2022-11-23T02:55:39.5455091Z 2022-11-23T02:55:39.5455345Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5455451Z Ran 1 test in 10.570s 2022-11-23T02:55:39.5455471Z 2022-11-23T02:55:39.5455551Z OK 2022-11-23T02:55:39.5455573Z 2022-11-23T02:55:39.5455691Z Generating XML reports... 2022-11-23T02:55:39.5456221Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024258.xml 2022-11-23T02:55:39.5456583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5456748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5457114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5457293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5457537Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgtg8dtlo 2022-11-23T02:55:39.5457797Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgtg8dtlo/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5457869Z 2022-11-23T02:55:39.5458132Z Running tests... 
2022-11-23T02:55:39.5458382Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5458709Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5458996Z test_device_map_gpu_mixed_self_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5459377Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93254 2022-11-23T02:55:39.5459582Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93255 2022-11-23T02:55:39.5459785Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 93256 2022-11-23T02:55:39.5459984Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 93257 2022-11-23T02:55:39.5460342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5460575Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5460937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5461116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5461466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5461628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5461988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5462165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5462511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5462680Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5463038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5463206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5463574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5463760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5464322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5464657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5465067Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp28mciiu2 2022-11-23T02:55:39.5465333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp28mciiu2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5465577Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvc94vkss 2022-11-23T02:55:39.5465827Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvc94vkss/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5466070Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppladmmra 2022-11-23T02:55:39.5466330Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppladmmra/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5466568Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4evpn5qg 2022-11-23T02:55:39.5466820Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4evpn5qg/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5467037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5467254Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5467542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5467812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5467950Z fi_getinfo: -61 2022-11-23T02:55:39.5468074Z fi_getinfo: -61 2022-11-23T02:55:39.5468195Z fi_getinfo: -61 2022-11-23T02:55:39.5468316Z fi_getinfo: -61 2022-11-23T02:55:39.5468405Z ok (10.456s) 2022-11-23T02:55:39.5468425Z 2022-11-23T02:55:39.5468676Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5468933Z Ran 1 test in 10.457s 2022-11-23T02:55:39.5468952Z 2022-11-23T02:55:39.5469022Z OK 2022-11-23T02:55:39.5469040Z 2022-11-23T02:55:39.5469149Z Generating XML reports... 2022-11-23T02:55:39.5469849Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024312.xml 2022-11-23T02:55:39.5470287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5470452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5470818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5470999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5471243Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp10ulrkn2 2022-11-23T02:55:39.5471499Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp10ulrkn2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5471519Z 2022-11-23T02:55:39.5471607Z Running tests... 
2022-11-23T02:55:39.5471860Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5472206Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5472511Z test_device_map_gpu_mixed_self_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5472719Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93765 2022-11-23T02:55:39.5472922Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93766 2022-11-23T02:55:39.5473129Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 93767 2022-11-23T02:55:39.5473326Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 93768 2022-11-23T02:55:39.5473677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5473841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5474205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5474547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5474888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5475043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5475391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5475563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5475903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5476054Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5476406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5476580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5476967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5477129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5477476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5477645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5477881Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt2zknksb 2022-11-23T02:55:39.5478127Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt2zknksb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5478361Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzcnykqk_ 2022-11-23T02:55:39.5478604Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzcnykqk_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5478885Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpckyh50cu 2022-11-23T02:55:39.5479130Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpckyh50cu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5479358Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7vxpn28w 2022-11-23T02:55:39.5479601Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7vxpn28w/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5479810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5480018Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.5480218Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5480428Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5480564Z fi_getinfo: -61 2022-11-23T02:55:39.5480687Z fi_getinfo: -61 2022-11-23T02:55:39.5480809Z fi_getinfo: -61 2022-11-23T02:55:39.5480928Z fi_getinfo: -61 2022-11-23T02:55:39.5481012Z ok (10.591s) 2022-11-23T02:55:39.5481030Z 2022-11-23T02:55:39.5481269Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5481368Z Ran 1 test in 10.591s 2022-11-23T02:55:39.5481386Z 2022-11-23T02:55:39.5481462Z OK 2022-11-23T02:55:39.5481481Z 2022-11-23T02:55:39.5481589Z Generating XML reports... 2022-11-23T02:55:39.5482102Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024325.xml 2022-11-23T02:55:39.5482444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5482600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5482951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5483129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5483354Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuzdib7xt 2022-11-23T02:55:39.5483783Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuzdib7xt/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5483802Z 2022-11-23T02:55:39.5483901Z Running tests... 
2022-11-23T02:55:39.5484162Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5484511Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5484808Z test_device_map_gpu_mixed_self_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5485015Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94276 2022-11-23T02:55:39.5485223Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94277 2022-11-23T02:55:39.5485466Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 94278 2022-11-23T02:55:39.5485678Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 94279 2022-11-23T02:55:39.5486039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5486204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5486724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5486898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5487419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5487578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5487991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5488162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5488509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5488672Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5489030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5489204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5489560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5489721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5490229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5490391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5490623Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6uwwk_nm 2022-11-23T02:55:39.5491047Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6uwwk_nm/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5491289Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8003quh8 2022-11-23T02:55:39.5491544Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8003quh8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5491783Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl5dsvpaj 2022-11-23T02:55:39.5492040Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl5dsvpaj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5492283Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyio66zye 2022-11-23T02:55:39.5492540Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyio66zye/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5492750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5492964Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.5493175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5493389Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5493524Z fi_getinfo: -61 2022-11-23T02:55:39.5493649Z fi_getinfo: -61 2022-11-23T02:55:39.5493772Z fi_getinfo: -61 2022-11-23T02:55:39.5493888Z fi_getinfo: -61 2022-11-23T02:55:39.5493975Z ok (10.527s) 2022-11-23T02:55:39.5493994Z 2022-11-23T02:55:39.5494247Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5494351Z Ran 1 test in 10.527s 2022-11-23T02:55:39.5494370Z 2022-11-23T02:55:39.5494501Z OK 2022-11-23T02:55:39.5494522Z 2022-11-23T02:55:39.5494639Z Generating XML reports... 2022-11-23T02:55:39.5495176Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024338.xml 2022-11-23T02:55:39.5495532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5495695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5496054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5496235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5496639Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqjz6y7y1 2022-11-23T02:55:39.5497105Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqjz6y7y1/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5497127Z 2022-11-23T02:55:39.5497231Z Running tests... 
2022-11-23T02:55:39.5497488Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5497835Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5498132Z test_device_map_gpu_mixed_self_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5498332Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94787 2022-11-23T02:55:39.5498542Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94788 2022-11-23T02:55:39.5498747Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 94789 2022-11-23T02:55:39.5498950Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 94790 2022-11-23T02:55:39.5499319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5499486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5499850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5500028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5500378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5500535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5500894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5501234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5501572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5501734Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5502080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5502253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5502592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5502741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5503084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5503257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5503494Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_p67u83i 2022-11-23T02:55:39.5503790Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_p67u83i/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5504278Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpur8r6qfu 2022-11-23T02:55:39.5504537Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpur8r6qfu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5504770Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi5c9mf9o 2022-11-23T02:55:39.5505014Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi5c9mf9o/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5505237Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphwqdc6dn 2022-11-23T02:55:39.5505488Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphwqdc6dn/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5505697Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5506158Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.5506373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5506585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5506724Z fi_getinfo: -61 2022-11-23T02:55:39.5506849Z fi_getinfo: -61 2022-11-23T02:55:39.5506965Z fi_getinfo: -61 2022-11-23T02:55:39.5507092Z fi_getinfo: -61 2022-11-23T02:55:39.5507179Z ok (10.626s) 2022-11-23T02:55:39.5507198Z 2022-11-23T02:55:39.5507449Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5507550Z Ran 1 test in 10.626s 2022-11-23T02:55:39.5507569Z 2022-11-23T02:55:39.5507648Z OK 2022-11-23T02:55:39.5507667Z 2022-11-23T02:55:39.5507783Z Generating XML reports... 2022-11-23T02:55:39.5508323Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024352.xml 2022-11-23T02:55:39.5508685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5508852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5509217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5509396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5509640Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpowa4b888 2022-11-23T02:55:39.5510231Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpowa4b888/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5510250Z 2022-11-23T02:55:39.5510346Z Running tests... 
2022-11-23T02:55:39.5510605Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5510949Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5511249Z test_device_map_gpu_mixed_self_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5511456Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95298 2022-11-23T02:55:39.5511664Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95299 2022-11-23T02:55:39.5511867Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 95300 2022-11-23T02:55:39.5512064Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 95301 2022-11-23T02:55:39.5512424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5512587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5512951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5513190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5513557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5513719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5514235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5514408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5514744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5514898Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5515245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5515490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5515839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5515995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5516342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5516510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5516748Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpic62z9d6 2022-11-23T02:55:39.5516995Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpic62z9d6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5517227Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpksih8381 2022-11-23T02:55:39.5517471Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpksih8381/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5517877Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsiksn0wg 2022-11-23T02:55:39.5518133Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsiksn0wg/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5518370Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpos0e8f0g 2022-11-23T02:55:39.5518618Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpos0e8f0g/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5518831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5519046Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.5519259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5519471Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5519604Z fi_getinfo: -61 2022-11-23T02:55:39.5519735Z fi_getinfo: -61 2022-11-23T02:55:39.5519857Z fi_getinfo: -61 2022-11-23T02:55:39.5519980Z fi_getinfo: -61 2022-11-23T02:55:39.5520066Z ok (10.623s) 2022-11-23T02:55:39.5520085Z 2022-11-23T02:55:39.5520340Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5520441Z Ran 1 test in 10.624s 2022-11-23T02:55:39.5520460Z 2022-11-23T02:55:39.5520538Z OK 2022-11-23T02:55:39.5520557Z 2022-11-23T02:55:39.5520663Z Generating XML reports... 2022-11-23T02:55:39.5521357Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024405.xml 2022-11-23T02:55:39.5521704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5521867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5522270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5522451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5522685Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphyzgek7f 2022-11-23T02:55:39.5523107Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphyzgek7f/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5523127Z 2022-11-23T02:55:39.5523216Z Running tests... 
2022-11-23T02:55:39.5523472Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5523818Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5524114Z test_device_map_gpu_mixed_self_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5524321Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95809 2022-11-23T02:55:39.5524589Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95810 2022-11-23T02:55:39.5524794Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 95811 2022-11-23T02:55:39.5524995Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 95812 2022-11-23T02:55:39.5525355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5525514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5525878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5526057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5526406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5526572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5526933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5527111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5527458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5527613Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5527969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5528144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5528490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5528652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5529021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5529199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5529443Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt5gimdww 2022-11-23T02:55:39.5529703Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt5gimdww/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5529938Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdc_uclbl 2022-11-23T02:55:39.5530191Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdc_uclbl/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5530432Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6pey816q 2022-11-23T02:55:39.5530689Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6pey816q/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5530977Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxjzvmmkr 2022-11-23T02:55:39.5531391Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxjzvmmkr/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5531603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5531812Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.5532009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5532215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5532345Z fi_getinfo: -61 2022-11-23T02:55:39.5532642Z fi_getinfo: -61 2022-11-23T02:55:39.5532767Z fi_getinfo: -61 2022-11-23T02:55:39.5532890Z fi_getinfo: -61 2022-11-23T02:55:39.5532977Z ok (10.494s) 2022-11-23T02:55:39.5532997Z 2022-11-23T02:55:39.5533249Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5533400Z Ran 1 test in 10.495s 2022-11-23T02:55:39.5533420Z 2022-11-23T02:55:39.5533499Z OK 2022-11-23T02:55:39.5533518Z 2022-11-23T02:55:39.5533631Z Generating XML reports... 2022-11-23T02:55:39.5534169Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024419.xml 2022-11-23T02:55:39.5534525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5534689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5535055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5535234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5535631Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf7x6wg65 2022-11-23T02:55:39.5535885Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf7x6wg65/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5535904Z 2022-11-23T02:55:39.5535998Z Running tests... 
2022-11-23T02:55:39.5536243Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5536576Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5536861Z test_device_map_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5537061Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96320 2022-11-23T02:55:39.5537257Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96321 2022-11-23T02:55:39.5537451Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 96322 2022-11-23T02:55:39.5537640Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 96323 2022-11-23T02:55:39.5537988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5538145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5538679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5538856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5539203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5539364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5539721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5539893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5540293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5540462Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5540822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5540999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5541510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5541668Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5542011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5542182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5542412Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqjfbbd2b 2022-11-23T02:55:39.5542715Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqjfbbd2b/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5542946Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7vcjwi21 2022-11-23T02:55:39.5543190Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7vcjwi21/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5543423Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqpumkioq 2022-11-23T02:55:39.5543671Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqpumkioq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5544263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp364nc0b 2022-11-23T02:55:39.5544525Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp364nc0b/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5544733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5544963Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.5545176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5545389Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5545531Z fi_getinfo: -61 2022-11-23T02:55:39.5545654Z fi_getinfo: -61 2022-11-23T02:55:39.5545777Z fi_getinfo: -61 2022-11-23T02:55:39.5545902Z fi_getinfo: -61 2022-11-23T02:55:39.5545983Z ok (8.044s) 2022-11-23T02:55:39.5546002Z 2022-11-23T02:55:39.5546256Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5546355Z Ran 1 test in 8.044s 2022-11-23T02:55:39.5546374Z 2022-11-23T02:55:39.5546452Z OK 2022-11-23T02:55:39.5546471Z 2022-11-23T02:55:39.5546582Z Generating XML reports... 2022-11-23T02:55:39.5547268Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024432.xml 2022-11-23T02:55:39.5547621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5547780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5548316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5548495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5548737Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1ejj4z04 2022-11-23T02:55:39.5548996Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1ejj4z04/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5549016Z 2022-11-23T02:55:39.5549116Z Running tests... 
2022-11-23T02:55:39.5549368Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5549788Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5550113Z test_device_map_gpu_non_default_to_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5550319Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96827 2022-11-23T02:55:39.5550516Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96828 2022-11-23T02:55:39.5550715Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 96829 2022-11-23T02:55:39.5551067Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 96830 2022-11-23T02:55:39.5551416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5551574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5551924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5552159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5552497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5552645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5552991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5553161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5553494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5553651Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5553993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5554179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5554521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5554677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5555013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5555182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5555419Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgvj5pg64 2022-11-23T02:55:39.5555845Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgvj5pg64/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5556090Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3w7iwja2 2022-11-23T02:55:39.5556352Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3w7iwja2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5556593Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpct4ejlsb 2022-11-23T02:55:39.5556854Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpct4ejlsb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5557085Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3zvjqtt4 2022-11-23T02:55:39.5557340Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3zvjqtt4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5557555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5557769Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5557981Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5558188Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5558536Z fi_getinfo: -61 2022-11-23T02:55:39.5558667Z fi_getinfo: -61 2022-11-23T02:55:39.5558782Z fi_getinfo: -61 2022-11-23T02:55:39.5558900Z fi_getinfo: -61 2022-11-23T02:55:39.5558984Z ok (10.607s) 2022-11-23T02:55:39.5559002Z 2022-11-23T02:55:39.5559245Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5559341Z Ran 1 test in 10.608s 2022-11-23T02:55:39.5559360Z 2022-11-23T02:55:39.5559438Z OK 2022-11-23T02:55:39.5559625Z 2022-11-23T02:55:39.5559853Z Generating XML reports... 2022-11-23T02:55:39.5560391Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024443.xml 2022-11-23T02:55:39.5560741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5560906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5561331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5561511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5561754Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgje90qdb 2022-11-23T02:55:39.5562017Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgje90qdb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5562036Z 2022-11-23T02:55:39.5562131Z Running tests... 
2022-11-23T02:55:39.5562383Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5562727Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5563021Z test_device_map_gpu_to_cpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5563387Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97346 2022-11-23T02:55:39.5563590Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97347 2022-11-23T02:55:39.5563833Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 97348 2022-11-23T02:55:39.5564025Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 97349 2022-11-23T02:55:39.5564378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5564536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5564888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5565053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5565568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5565737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5566104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5566283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5566634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5566798Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5567156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5567332Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5567673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5567885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5568294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5568480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5568725Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyiwxemj3 2022-11-23T02:55:39.5568983Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyiwxemj3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5569378Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo1h_4g1y 2022-11-23T02:55:39.5569622Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo1h_4g1y/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5569845Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7the2e56 2022-11-23T02:55:39.5570087Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7the2e56/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5570369Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr90_0ny7 2022-11-23T02:55:39.5570608Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr90_0ny7/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5570817Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5571024Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5571231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5571436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5571566Z fi_getinfo: -61 2022-11-23T02:55:39.5571680Z fi_getinfo: -61 2022-11-23T02:55:39.5571797Z fi_getinfo: -61 2022-11-23T02:55:39.5572087Z fi_getinfo: -61 2022-11-23T02:55:39.5572174Z ok (8.140s) 2022-11-23T02:55:39.5572197Z 2022-11-23T02:55:39.5572452Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5572555Z Ran 1 test in 8.140s 2022-11-23T02:55:39.5572574Z 2022-11-23T02:55:39.5572653Z OK 2022-11-23T02:55:39.5572672Z 2022-11-23T02:55:39.5572777Z Generating XML reports... 2022-11-23T02:55:39.5573312Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024456.xml 2022-11-23T02:55:39.5573676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5573842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5574212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5574397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5574640Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdpiyuo63 2022-11-23T02:55:39.5575062Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdpiyuo63/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5575081Z 2022-11-23T02:55:39.5575177Z Running tests... 
2022-11-23T02:55:39.5575414Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5575748Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5576046Z test_device_map_gpu_to_cpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5576249Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97857 2022-11-23T02:55:39.5576447Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97858 2022-11-23T02:55:39.5576643Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 97859 2022-11-23T02:55:39.5576839Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 97860 2022-11-23T02:55:39.5577233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5577391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5577748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5577920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5578255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5578409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5578751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5578923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5579310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5579466Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5579801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5579970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5580310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5580465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5580816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5580985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5581230Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdmzvo3ia 2022-11-23T02:55:39.5581482Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdmzvo3ia/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5581707Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjzud66om 2022-11-23T02:55:39.5581956Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjzud66om/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5582185Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoaf5lgub 2022-11-23T02:55:39.5582428Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoaf5lgub/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5582656Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaw1d3hp5 2022-11-23T02:55:39.5582900Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaw1d3hp5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5583115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5583325Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.5583533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5583732Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5584243Z fi_getinfo: -61 2022-11-23T02:55:39.5584380Z fi_getinfo: -61 2022-11-23T02:55:39.5584503Z fi_getinfo: -61 2022-11-23T02:55:39.5584624Z fi_getinfo: -61 2022-11-23T02:55:39.5584711Z ok (8.151s) 2022-11-23T02:55:39.5584730Z 2022-11-23T02:55:39.5584984Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5585076Z Ran 1 test in 8.151s 2022-11-23T02:55:39.5585095Z 2022-11-23T02:55:39.5585177Z OK 2022-11-23T02:55:39.5585196Z 2022-11-23T02:55:39.5585307Z Generating XML reports... 2022-11-23T02:55:39.5585911Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024507.xml 2022-11-23T02:55:39.5586278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5586443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5586959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5587132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5587537Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpafgfloel 2022-11-23T02:55:39.5587790Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpafgfloel/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5587810Z 2022-11-23T02:55:39.5587909Z Running tests... 
2022-11-23T02:55:39.5588161Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5588590Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5588873Z test_device_maps_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5589078Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98368 2022-11-23T02:55:39.5589282Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98369 2022-11-23T02:55:39.5589484Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 98370 2022-11-23T02:55:39.5589677Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 98371 2022-11-23T02:55:39.5590036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5590199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5590568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5590750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5591101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5591263Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5591619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5591799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5592142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5592304Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5592659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5592841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5593195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5593355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5593865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5594034Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5594265Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz5uvr3ub 2022-11-23T02:55:39.5594512Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz5uvr3ub/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5594746Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4zshks0z 2022-11-23T02:55:39.5595040Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4zshks0z/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5595280Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpovn7xs9p 2022-11-23T02:55:39.5595524Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpovn7xs9p/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5595752Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnmstqebw 2022-11-23T02:55:39.5595994Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnmstqebw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5596204Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5596407Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 1 2022-11-23T02:55:39.5596614Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5596870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5597005Z fi_getinfo: -61 2022-11-23T02:55:39.5597126Z fi_getinfo: -61 2022-11-23T02:55:39.5597246Z fi_getinfo: -61 2022-11-23T02:55:39.5597544Z fi_getinfo: -61 2022-11-23T02:55:39.5597625Z ok (10.502s) 2022-11-23T02:55:39.5597644Z 2022-11-23T02:55:39.5597897Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5597997Z Ran 1 test in 10.502s 2022-11-23T02:55:39.5598016Z 2022-11-23T02:55:39.5598095Z OK 2022-11-23T02:55:39.5598114Z 2022-11-23T02:55:39.5598226Z Generating XML reports... 2022-11-23T02:55:39.5598763Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024518.xml 2022-11-23T02:55:39.5599121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5599289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5599660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5599832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5600075Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzhwbivca 2022-11-23T02:55:39.5600337Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzhwbivca/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5600357Z 2022-11-23T02:55:39.5600454Z Running tests... 
2022-11-23T02:55:39.5600708Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5601055Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5601345Z test_device_maps_in_options (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5601557Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98887 2022-11-23T02:55:39.5601760Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98888 2022-11-23T02:55:39.5601965Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 98889 2022-11-23T02:55:39.5602165Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 98890 2022-11-23T02:55:39.5602526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5602690Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5603052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5603238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5603592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5603805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5604164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5604501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5604836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5604991Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5605333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5605503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5605841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5606043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5606381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5606550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5606784Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjmx_1g6e 2022-11-23T02:55:39.5607206Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjmx_1g6e/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5607450Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbafeu4cr 2022-11-23T02:55:39.5607712Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbafeu4cr/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5607952Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5f4a76r2 2022-11-23T02:55:39.5608203Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5f4a76r2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5608450Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphpwnispt 2022-11-23T02:55:39.5608696Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphpwnispt/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5608912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5609128Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5609346Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5609559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5609695Z fi_getinfo: -61 2022-11-23T02:55:39.5609818Z fi_getinfo: -61 2022-11-23T02:55:39.5609941Z fi_getinfo: -61 2022-11-23T02:55:39.5610219Z fi_getinfo: -61 2022-11-23T02:55:39.5610303Z ok (10.556s) 2022-11-23T02:55:39.5610325Z 2022-11-23T02:55:39.5610572Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5610670Z Ran 1 test in 10.556s 2022-11-23T02:55:39.5610688Z 2022-11-23T02:55:39.5610765Z OK 2022-11-23T02:55:39.5610785Z 2022-11-23T02:55:39.5610892Z Generating XML reports... 2022-11-23T02:55:39.5611408Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024532.xml 2022-11-23T02:55:39.5611756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5611908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5612258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5612435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5612672Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyka7bsny 2022-11-23T02:55:39.5612970Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyka7bsny/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5612990Z 2022-11-23T02:55:39.5613088Z Running tests... 
2022-11-23T02:55:39.5613332Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5613666Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5613960Z test_device_maps_invalid_max_local_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5614161Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99406 2022-11-23T02:55:39.5614357Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99407 2022-11-23T02:55:39.5614553Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 99408 2022-11-23T02:55:39.5614795Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 99409 2022-11-23T02:55:39.5615144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5615303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5615656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5615832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5616164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5616318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5616664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5616839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5617174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5617333Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5617685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5617857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5618185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5618341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5618691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5618863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5619107Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwmdo7gjh 2022-11-23T02:55:39.5619357Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwmdo7gjh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5619590Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzmb3hzar 2022-11-23T02:55:39.5619836Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzmb3hzar/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5620066Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa9i71lrb 2022-11-23T02:55:39.5620303Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa9i71lrb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5620531Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpza0qecd6 2022-11-23T02:55:39.5620773Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpza0qecd6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5621027Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5621241Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5621448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5621653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5621784Z fi_getinfo: -61 2022-11-23T02:55:39.5621897Z fi_getinfo: -61 2022-11-23T02:55:39.5622014Z fi_getinfo: -61 2022-11-23T02:55:39.5622131Z fi_getinfo: -61 2022-11-23T02:55:39.5622215Z ok (4.727s) 2022-11-23T02:55:39.5622234Z 2022-11-23T02:55:39.5622479Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5622579Z Ran 1 test in 4.728s 2022-11-23T02:55:39.5622598Z 2022-11-23T02:55:39.5622673Z OK 2022-11-23T02:55:39.5622691Z 2022-11-23T02:55:39.5622793Z Generating XML reports... 2022-11-23T02:55:39.5623364Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024545.xml 2022-11-23T02:55:39.5623707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5624049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5624420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5624592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5624826Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjhd19uht 2022-11-23T02:55:39.5625075Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjhd19uht/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5625094Z 2022-11-23T02:55:39.5625191Z Running tests... 
2022-11-23T02:55:39.5625609Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5625959Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5626273Z test_device_maps_invalid_max_remote_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5626483Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99753 2022-11-23T02:55:39.5626688Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99754 2022-11-23T02:55:39.5626892Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 99755 2022-11-23T02:55:39.5627091Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 99756 2022-11-23T02:55:39.5627450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5627614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5627979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5628158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5628509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5628671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5629032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5629210Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5629554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5629715Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5630147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5630335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5630692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5630854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5631211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5631546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5631780Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_blpc__x 2022-11-23T02:55:39.5632027Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_blpc__x/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5632257Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzixs_51g 2022-11-23T02:55:39.5632564Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzixs_51g/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5632798Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpepqhq9zl 2022-11-23T02:55:39.5633046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpepqhq9zl/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5633271Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_m94xkd9 2022-11-23T02:55:39.5633511Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_m94xkd9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5633719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5633926Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5634129Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5634335Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5634468Z fi_getinfo: -61 2022-11-23T02:55:39.5634588Z fi_getinfo: -61 2022-11-23T02:55:39.5634708Z fi_getinfo: -61 2022-11-23T02:55:39.5634827Z fi_getinfo: -61 2022-11-23T02:55:39.5634912Z ok (4.763s) 2022-11-23T02:55:39.5634930Z 2022-11-23T02:55:39.5635173Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5635264Z Ran 1 test in 4.763s 2022-11-23T02:55:39.5635290Z 2022-11-23T02:55:39.5635360Z OK 2022-11-23T02:55:39.5635378Z 2022-11-23T02:55:39.5635487Z Generating XML reports... 2022-11-23T02:55:39.5636005Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024553.xml 2022-11-23T02:55:39.5636347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5636509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5636865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5637041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5637278Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpph153cvh 2022-11-23T02:55:39.5637520Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpph153cvh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5637546Z 2022-11-23T02:55:39.5637633Z Running tests... 
2022-11-23T02:55:39.5637875Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5638206Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5638498Z test_device_maps_invalid_min_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5638753Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100100 2022-11-23T02:55:39.5639144Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100101 2022-11-23T02:55:39.5639348Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100102 2022-11-23T02:55:39.5639546Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100103 2022-11-23T02:55:39.5639900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5640066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5640428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5640612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5640960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5641177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5641540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5641877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5642207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5642362Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5642702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5642869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5643208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5643368Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5643710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5643878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5644114Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsk0daolh 2022-11-23T02:55:39.5644358Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsk0daolh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5644776Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnxwgfulk 2022-11-23T02:55:39.5645037Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnxwgfulk/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5645277Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3ydl5kvl 2022-11-23T02:55:39.5645535Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3ydl5kvl/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5645778Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpygz4e6kp 2022-11-23T02:55:39.5646027Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpygz4e6kp/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5646243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5646451Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 1 2022-11-23T02:55:39.5646666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5646880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5647016Z fi_getinfo: -61 2022-11-23T02:55:39.5647141Z fi_getinfo: -61 2022-11-23T02:55:39.5647261Z fi_getinfo: -61 2022-11-23T02:55:39.5647543Z fi_getinfo: -61 2022-11-23T02:55:39.5647622Z ok (4.852s) 2022-11-23T02:55:39.5647646Z 2022-11-23T02:55:39.5647931Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5648036Z Ran 1 test in 4.852s 2022-11-23T02:55:39.5648055Z 2022-11-23T02:55:39.5648134Z OK 2022-11-23T02:55:39.5648152Z 2022-11-23T02:55:39.5648435Z Generating XML reports... 2022-11-23T02:55:39.5648972Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024600.xml 2022-11-23T02:55:39.5649331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5649495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5649860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5650032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5650339Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp63fjbal0 2022-11-23T02:55:39.5650595Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp63fjbal0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5650615Z 2022-11-23T02:55:39.5650710Z Running tests... 
2022-11-23T02:55:39.5650963Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5651467Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5651749Z test_device_maps_many_to_one (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5651951Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100435 2022-11-23T02:55:39.5652149Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100436 2022-11-23T02:55:39.5652337Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100437 2022-11-23T02:55:39.5652538Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100438 2022-11-23T02:55:39.5652884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5653042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5653396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5653569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5653908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5654064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5654407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5654587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5654926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5655080Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5655423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5655590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5656110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5656271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5656628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5656799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5657125Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbrldj7q7 2022-11-23T02:55:39.5657395Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbrldj7q7/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5657636Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjy1_68e1 2022-11-23T02:55:39.5657887Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjy1_68e1/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5658128Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb23jskeo 2022-11-23T02:55:39.5658381Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb23jskeo/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5658620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwi503d2d 2022-11-23T02:55:39.5659022Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwi503d2d/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5659278Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5659487Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 1 2022-11-23T02:55:39.5659692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5659895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5660203Z fi_getinfo: -61 2022-11-23T02:55:39.5660328Z fi_getinfo: -61 2022-11-23T02:55:39.5660453Z fi_getinfo: -61 2022-11-23T02:55:39.5660569Z fi_getinfo: -61 2022-11-23T02:55:39.5660656Z ok (4.844s) 2022-11-23T02:55:39.5660675Z 2022-11-23T02:55:39.5660927Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5661026Z Ran 1 test in 4.844s 2022-11-23T02:55:39.5661045Z 2022-11-23T02:55:39.5661123Z OK 2022-11-23T02:55:39.5661145Z 2022-11-23T02:55:39.5661259Z Generating XML reports... 2022-11-23T02:55:39.5661795Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024608.xml 2022-11-23T02:55:39.5662161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5662319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5662685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5662863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5663102Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp355fggwg 2022-11-23T02:55:39.5663356Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp355fggwg/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5663376Z 2022-11-23T02:55:39.5663475Z Running tests... 
2022-11-23T02:55:39.5664106Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5664457Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5664744Z test_device_maps_missing_config (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5664940Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100782 2022-11-23T02:55:39.5665141Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100783 2022-11-23T02:55:39.5665340Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100784 2022-11-23T02:55:39.5665535Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100785 2022-11-23T02:55:39.5666057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5666228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5666662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5666852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5667197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5667357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5667715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5667942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5668291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5668450Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5668876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5669053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5669401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5669710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5670059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5670229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5670463Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwr33dqt9 2022-11-23T02:55:39.5670714Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwr33dqt9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5670956Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8s3mzdrv 2022-11-23T02:55:39.5671204Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8s3mzdrv/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5671437Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwslvq20r 2022-11-23T02:55:39.5671674Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwslvq20r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5671904Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfhjaa3by 2022-11-23T02:55:39.5672146Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfhjaa3by/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5672532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5672750Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.5672962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5673182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5673318Z fi_getinfo: -61 2022-11-23T02:55:39.5673438Z fi_getinfo: -61 2022-11-23T02:55:39.5673561Z fi_getinfo: -61 2022-11-23T02:55:39.5673684Z fi_getinfo: -61 2022-11-23T02:55:39.5673772Z ok (6.854s) 2022-11-23T02:55:39.5673792Z 2022-11-23T02:55:39.5674046Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5674145Z Ran 1 test in 6.854s 2022-11-23T02:55:39.5674165Z 2022-11-23T02:55:39.5674246Z OK 2022-11-23T02:55:39.5674265Z 2022-11-23T02:55:39.5674378Z Generating XML reports... 2022-11-23T02:55:39.5674904Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024616.xml 2022-11-23T02:55:39.5675414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5675624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5675988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5676161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5676394Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphzl86lz6 2022-11-23T02:55:39.5676641Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphzl86lz6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5676660Z 2022-11-23T02:55:39.5676758Z Running tests... 
2022-11-23T02:55:39.5677004Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5677334Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5677630Z test_device_maps_missing_config_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5677884Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101289 2022-11-23T02:55:39.5678082Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101290 2022-11-23T02:55:39.5678279Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 101291 2022-11-23T02:55:39.5678473Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 101292 2022-11-23T02:55:39.5678820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5678980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5679327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5679500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5679846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5680005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5680355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5680530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5680866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5681025Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5681374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5681539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5681879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5682041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5682385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5682554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5682789Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp90_224bk 2022-11-23T02:55:39.5683035Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp90_224bk/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5683267Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe64ch9a8 2022-11-23T02:55:39.5683500Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe64ch9a8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5683733Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpokv39h62 2022-11-23T02:55:39.5684045Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpokv39h62/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5684460Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphpebb44r 2022-11-23T02:55:39.5684717Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphpebb44r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5684931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5685147Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5685360Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5685577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5685705Z fi_getinfo: -61 2022-11-23T02:55:39.5685830Z fi_getinfo: -61 2022-11-23T02:55:39.5686009Z fi_getinfo: -61 2022-11-23T02:55:39.5686130Z fi_getinfo: -61 2022-11-23T02:55:39.5686218Z ok (6.925s) 2022-11-23T02:55:39.5686241Z 2022-11-23T02:55:39.5686498Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5686597Z Ran 1 test in 6.925s 2022-11-23T02:55:39.5686616Z 2022-11-23T02:55:39.5686690Z OK 2022-11-23T02:55:39.5686708Z 2022-11-23T02:55:39.5686822Z Generating XML reports... 2022-11-23T02:55:39.5687506Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024625.xml 2022-11-23T02:55:39.5688083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5688247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5688614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5688797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5689043Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_he4nnp6 2022-11-23T02:55:39.5689300Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_he4nnp6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5689320Z 2022-11-23T02:55:39.5689410Z Running tests... 
2022-11-23T02:55:39.5689663Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5690006Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5690321Z test_device_maps_missing_config_not_timeout (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5690530Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101796 2022-11-23T02:55:39.5690888Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101797 2022-11-23T02:55:39.5691085Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 101798 2022-11-23T02:55:39.5691456Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 101799 2022-11-23T02:55:39.5691810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5691975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5692340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5692520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5692869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5693031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5693382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5693595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5693971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5694301Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5694649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5694819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5695158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5695314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5695656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5695873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5696106Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_pj4cf4c 2022-11-23T02:55:39.5696348Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_pj4cf4c/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5696582Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpswg8xqkm 2022-11-23T02:55:39.5696834Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpswg8xqkm/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5697071Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqp56os6b 2022-11-23T02:55:39.5697317Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqp56os6b/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5697548Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7o7q3g8 2022-11-23T02:55:39.5697974Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv7o7q3g8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5698196Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5698411Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 1 2022-11-23T02:55:39.5698611Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5698821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5698958Z fi_getinfo: -61 2022-11-23T02:55:39.5699081Z fi_getinfo: -61 2022-11-23T02:55:39.5699205Z fi_getinfo: -61 2022-11-23T02:55:39.5699331Z fi_getinfo: -61 2022-11-23T02:55:39.5699422Z ok (6.729s) 2022-11-23T02:55:39.5699442Z 2022-11-23T02:55:39.5699688Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5699788Z Ran 1 test in 6.729s 2022-11-23T02:55:39.5699808Z 2022-11-23T02:55:39.5699890Z OK 2022-11-23T02:55:39.5699909Z 2022-11-23T02:55:39.5700020Z Generating XML reports... 2022-11-23T02:55:39.5700714Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024635.xml 2022-11-23T02:55:39.5701064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5701221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5701572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5701743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5701969Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp72xc5x9g 2022-11-23T02:55:39.5702214Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp72xc5x9g/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5702237Z 2022-11-23T02:55:39.5702329Z Running tests... 
2022-11-23T02:55:39.5702619Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5702964Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5703261Z test_device_maps_missing_config_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5703460Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102303 2022-11-23T02:55:39.5703661Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102304 2022-11-23T02:55:39.5704198Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 102305 2022-11-23T02:55:39.5704417Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 102306 2022-11-23T02:55:39.5704785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5705026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5705398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5705583Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5705932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5706098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5706459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5706631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5706979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5707142Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5707503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5707680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5708033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5708194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5708549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5708723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5708959Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1zah3xz3 2022-11-23T02:55:39.5709219Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1zah3xz3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5709467Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxv0r3bqt 2022-11-23T02:55:39.5709725Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxv0r3bqt/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5709967Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd1hwp194 2022-11-23T02:55:39.5710220Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd1hwp194/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5710616Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzkc8o64c 2022-11-23T02:55:39.5710860Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzkc8o64c/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5711066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5711277Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5711544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5711758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5711888Z fi_getinfo: -61 2022-11-23T02:55:39.5712007Z fi_getinfo: -61 2022-11-23T02:55:39.5712123Z fi_getinfo: -61 2022-11-23T02:55:39.5712242Z fi_getinfo: -61 2022-11-23T02:55:39.5712320Z ok (6.834s) 2022-11-23T02:55:39.5712339Z 2022-11-23T02:55:39.5712581Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5712678Z Ran 1 test in 6.834s 2022-11-23T02:55:39.5712696Z 2022-11-23T02:55:39.5712773Z OK 2022-11-23T02:55:39.5712791Z 2022-11-23T02:55:39.5712899Z Generating XML reports... 2022-11-23T02:55:39.5713413Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024645.xml 2022-11-23T02:55:39.5713823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5713985Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5714333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5714505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5714739Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph4ck9v1o 2022-11-23T02:55:39.5714987Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph4ck9v1o/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5715006Z 2022-11-23T02:55:39.5715097Z Running tests... 
2022-11-23T02:55:39.5715342Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5715673Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5715988Z test_device_maps_missing_config_remote_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5716192Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102810 2022-11-23T02:55:39.5716385Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102811 2022-11-23T02:55:39.5716585Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 102812 2022-11-23T02:55:39.5716780Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 102813 2022-11-23T02:55:39.5717127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5717286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5717641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5717823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5718165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5718317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5718669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5718840Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5719173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5719327Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5719681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5719850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5720238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5720401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5720743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5720912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5721147Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv81spk4q 2022-11-23T02:55:39.5721398Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv81spk4q/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5721633Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppqc8irjk 2022-11-23T02:55:39.5721882Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppqc8irjk/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5722161Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjo_4nom5 2022-11-23T02:55:39.5722407Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjo_4nom5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5722632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptv9d5vjm 2022-11-23T02:55:39.5722879Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptv9d5vjm/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5723087Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5723373Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5723607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5723813Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5723945Z fi_getinfo: -61 2022-11-23T02:55:39.5724070Z fi_getinfo: -61 2022-11-23T02:55:39.5724182Z fi_getinfo: -61 2022-11-23T02:55:39.5724303Z fi_getinfo: -61 2022-11-23T02:55:39.5724388Z ok (6.806s) 2022-11-23T02:55:39.5724406Z 2022-11-23T02:55:39.5724654Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5724755Z Ran 1 test in 6.806s 2022-11-23T02:55:39.5724773Z 2022-11-23T02:55:39.5724848Z OK 2022-11-23T02:55:39.5724866Z 2022-11-23T02:55:39.5724975Z Generating XML reports... 2022-11-23T02:55:39.5725489Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024654.xml 2022-11-23T02:55:39.5726011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5726177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5726542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5726728Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5726973Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4k7tj1pa 2022-11-23T02:55:39.5727229Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4k7tj1pa/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5727249Z 2022-11-23T02:55:39.5727347Z Running tests... 
2022-11-23T02:55:39.5727602Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5727942Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5728254Z test_device_maps_missing_config_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5728462Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103317 2022-11-23T02:55:39.5728669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103318 2022-11-23T02:55:39.5728927Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 103319 2022-11-23T02:55:39.5729140Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 103320 2022-11-23T02:55:39.5729504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5729671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5730037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5730212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5730567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5730730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5731143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5731322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5731675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5731996Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5732335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5732495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5732840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5733012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5733363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5733538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5733773Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3mf_54ow 2022-11-23T02:55:39.5734020Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3mf_54ow/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5734254Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp5r_bik3 2022-11-23T02:55:39.5734502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp5r_bik3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5734723Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuk2igjbq 2022-11-23T02:55:39.5734955Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpawvpriqf 2022-11-23T02:55:39.5735197Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuk2igjbq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5735443Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpawvpriqf/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5735649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5735848Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5736053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5736260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5736391Z fi_getinfo: -61 2022-11-23T02:55:39.5736507Z fi_getinfo: -61 2022-11-23T02:55:39.5736630Z fi_getinfo: -61 2022-11-23T02:55:39.5736746Z fi_getinfo: -61 2022-11-23T02:55:39.5736832Z ok (6.846s) 2022-11-23T02:55:39.5736851Z 2022-11-23T02:55:39.5737095Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5737196Z Ran 1 test in 6.846s 2022-11-23T02:55:39.5737215Z 2022-11-23T02:55:39.5737292Z OK 2022-11-23T02:55:39.5737356Z 2022-11-23T02:55:39.5737464Z Generating XML reports... 2022-11-23T02:55:39.5737984Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024704.xml 2022-11-23T02:55:39.5738329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5738486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5738836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5739011Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5739424Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpklr56k8s 2022-11-23T02:55:39.5739679Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpklr56k8s/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5739745Z 2022-11-23T02:55:39.5739847Z Running tests... 
2022-11-23T02:55:39.5740096Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5740440Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5740755Z test_device_maps_missing_config_response_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5740964Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103824 2022-11-23T02:55:39.5741167Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103825 2022-11-23T02:55:39.5741371Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 103826 2022-11-23T02:55:39.5741572Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 103827 2022-11-23T02:55:39.5741939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5742259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5742615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5742788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5743131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5743287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5743635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5743804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5744335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5744502Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5744842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5745194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5745543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5745702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5746064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5746236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5746484Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu19pquyj 2022-11-23T02:55:39.5746815Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu19pquyj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5747060Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz3_unice 2022-11-23T02:55:39.5747319Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz3_unice/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5747558Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmap7zrzy 2022-11-23T02:55:39.5747813Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmap7zrzy/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5748048Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbcb35jbw 2022-11-23T02:55:39.5748301Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbcb35jbw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5748514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5748782Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5748999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5749199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5749334Z fi_getinfo: -61 2022-11-23T02:55:39.5749458Z fi_getinfo: -61 2022-11-23T02:55:39.5749581Z fi_getinfo: -61 2022-11-23T02:55:39.5749702Z fi_getinfo: -61 2022-11-23T02:55:39.5749790Z ok (6.945s) 2022-11-23T02:55:39.5749809Z 2022-11-23T02:55:39.5750062Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5750155Z Ran 1 test in 6.945s 2022-11-23T02:55:39.5750174Z 2022-11-23T02:55:39.5750253Z OK 2022-11-23T02:55:39.5750272Z 2022-11-23T02:55:39.5750383Z Generating XML reports... 2022-11-23T02:55:39.5750918Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024714.xml 2022-11-23T02:55:39.5751288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5751452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5751973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5752148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5752385Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ox3ekxs 2022-11-23T02:55:39.5752805Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ox3ekxs/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5752824Z 2022-11-23T02:55:39.5752923Z Running tests... 
2022-11-23T02:55:39.5753177Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5753523Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5753820Z test_device_maps_multi_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5754029Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104331 2022-11-23T02:55:39.5754237Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104332 2022-11-23T02:55:39.5754440Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 104333 2022-11-23T02:55:39.5754636Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 104334 2022-11-23T02:55:39.5754997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5755163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5755529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5755754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5756116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5756277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5756635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5756814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5757160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5757319Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5757673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5757899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5758253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5758412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5758928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5759097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5759329Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw7953mo_ 2022-11-23T02:55:39.5759575Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw7953mo_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5759808Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbi0upl_h 2022-11-23T02:55:39.5760037Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8wf0ms0i 2022-11-23T02:55:39.5760291Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbi0upl_h/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5760708Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8wf0ms0i/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5760947Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp18iag94m 2022-11-23T02:55:39.5761200Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp18iag94m/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5761415Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5761623Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.5761836Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5762040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5762179Z fi_getinfo: -61 2022-11-23T02:55:39.5762303Z fi_getinfo: -61 2022-11-23T02:55:39.5762427Z fi_getinfo: -61 2022-11-23T02:55:39.5762551Z fi_getinfo: -61 2022-11-23T02:55:39.5762631Z ok (10.738s) 2022-11-23T02:55:39.5762650Z 2022-11-23T02:55:39.5762903Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5763003Z Ran 1 test in 10.738s 2022-11-23T02:55:39.5763022Z 2022-11-23T02:55:39.5763103Z OK 2022-11-23T02:55:39.5763122Z 2022-11-23T02:55:39.5763233Z Generating XML reports... 2022-11-23T02:55:39.5763806Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024724.xml 2022-11-23T02:55:39.5764328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5764488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5764890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5765073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5765309Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp18221f7d 2022-11-23T02:55:39.5765555Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp18221f7d/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5765574Z 2022-11-23T02:55:39.5765665Z Running tests... 
2022-11-23T02:55:39.5765912Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5766415Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5766711Z test_device_maps_multi_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5766919Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104850 2022-11-23T02:55:39.5767188Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104851 2022-11-23T02:55:39.5767395Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 104852 2022-11-23T02:55:39.5767597Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 104853 2022-11-23T02:55:39.5768009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5768175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5768541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5768720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5769070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5769234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5769594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5769923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5770444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5770603Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5770960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5771136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5771490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5771649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5772012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5772189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5772434Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5395qr2u 2022-11-23T02:55:39.5772691Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5395qr2u/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5772934Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv54ia4ox 2022-11-23T02:55:39.5773186Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv54ia4ox/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5773426Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnmzkbl3z 2022-11-23T02:55:39.5773679Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnmzkbl3z/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5773921Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpza_7va7n 2022-11-23T02:55:39.5774212Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpza_7va7n/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5774438Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5774655Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.5774869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5775080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5775213Z fi_getinfo: -61 2022-11-23T02:55:39.5775338Z fi_getinfo: -61 2022-11-23T02:55:39.5775620Z fi_getinfo: -61 2022-11-23T02:55:39.5775734Z fi_getinfo: -61 2022-11-23T02:55:39.5775818Z ok (10.617s) 2022-11-23T02:55:39.5775836Z 2022-11-23T02:55:39.5776081Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5776240Z Ran 1 test in 10.618s 2022-11-23T02:55:39.5776263Z 2022-11-23T02:55:39.5776340Z OK 2022-11-23T02:55:39.5776359Z 2022-11-23T02:55:39.5776468Z Generating XML reports... 2022-11-23T02:55:39.5776983Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024737.xml 2022-11-23T02:55:39.5777327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5777478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5777830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5778002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5778233Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5_ejm2r0 2022-11-23T02:55:39.5778485Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5_ejm2r0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5778504Z 2022-11-23T02:55:39.5778597Z Running tests... 
2022-11-23T02:55:39.5778841Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5779171Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5779444Z test_device_maps_one_to_many (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5779645Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105361 2022-11-23T02:55:39.5779845Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105362 2022-11-23T02:55:39.5780041Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 105363 2022-11-23T02:55:39.5780231Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 105364 2022-11-23T02:55:39.5780581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5780741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5781094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5781268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5781599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5781755Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5782102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5782273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5782659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5782821Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5783166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5783338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5783671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5783827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5784372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5784544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5784963Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2bs7sznu 2022-11-23T02:55:39.5785304Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2bs7sznu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5785544Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0qnd3s2d 2022-11-23T02:55:39.5785805Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0qnd3s2d/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5786045Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptc527m2y 2022-11-23T02:55:39.5786293Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptc527m2y/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5786532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphtnt6yw2 2022-11-23T02:55:39.5786781Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphtnt6yw2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5786997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5787220Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.5787434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5787803Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5787937Z fi_getinfo: -61 2022-11-23T02:55:39.5788016Z ok (4.526s) 2022-11-23T02:55:39.5788035Z 2022-11-23T02:55:39.5788458Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5788556Z Ran 1 test in 4.526s 2022-11-23T02:55:39.5788576Z 2022-11-23T02:55:39.5788655Z OK 2022-11-23T02:55:39.5788673Z 2022-11-23T02:55:39.5788784Z Generating XML reports... 2022-11-23T02:55:39.5789316Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024751.xml 2022-11-23T02:55:39.5789673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5789851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5790219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5790394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5790640Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1k5ys0l3 2022-11-23T02:55:39.5790896Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1k5ys0l3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5790916Z 2022-11-23T02:55:39.5791166Z Running tests... 2022-11-23T02:55:39.5791596Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5791942Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5792293Z test_device_maps_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5792510Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105693 2022-11-23T02:55:39.5792711Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105694 2022-11-23T02:55:39.5792915Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 105695 2022-11-23T02:55:39.5793114Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 105696 2022-11-23T02:55:39.5793477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5793642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5794008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5794189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5794636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5794800Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5795158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5795340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5795687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5795848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5796206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5796382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5796904Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5797063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5797402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5797571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5797807Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0tvfzdmp 2022-11-23T02:55:39.5798233Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0tvfzdmp/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5798475Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5ziob88r 2022-11-23T02:55:39.5798729Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5ziob88r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5798969Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuwq0oa74 2022-11-23T02:55:39.5799227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuwq0oa74/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5799462Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuut4lgyh 2022-11-23T02:55:39.5799706Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuut4lgyh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5799920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5800134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5800349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5800560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5800695Z fi_getinfo: -61 
2022-11-23T02:55:39.5800819Z fi_getinfo: -61 2022-11-23T02:55:39.5800947Z fi_getinfo: -61 2022-11-23T02:55:39.5801064Z fi_getinfo: -61 2022-11-23T02:55:39.5801198Z ok (10.523s) 2022-11-23T02:55:39.5801218Z 2022-11-23T02:55:39.5801634Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5801733Z Ran 1 test in 10.523s 2022-11-23T02:55:39.5801751Z 2022-11-23T02:55:39.5801827Z OK 2022-11-23T02:55:39.5801846Z 2022-11-23T02:55:39.5801954Z Generating XML reports... 2022-11-23T02:55:39.5802474Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024758.xml 2022-11-23T02:55:39.5802825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5802976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5803328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5803562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5803800Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcg8odgyb 2022-11-23T02:55:39.5804053Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcg8odgyb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5804072Z 2022-11-23T02:55:39.5804165Z Running tests... 2022-11-23T02:55:39.5804410Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5804743Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5805023Z test_device_maps_return_to_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5805227Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106212 2022-11-23T02:55:39.5805425Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106213 2022-11-23T02:55:39.5805626Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 106214 2022-11-23T02:55:39.5805819Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 106215 2022-11-23T02:55:39.5806167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5806502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5806868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5807048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5807393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5807554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5807912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5808095Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5808442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5808604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5808956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5809132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5809469Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5809631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5809989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5810221Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5810470Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn_7qkn85 2022-11-23T02:55:39.5810725Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn_7qkn85/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5810967Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpokr7m7uk 2022-11-23T02:55:39.5811220Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpokr7m7uk/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5811457Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_6xlhcfc 2022-11-23T02:55:39.5811701Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_6xlhcfc/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5811939Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpajteouwb 2022-11-23T02:55:39.5812241Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpajteouwb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5812457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5812672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5812886Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5813098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5813237Z fi_getinfo: -61 
2022-11-23T02:55:39.5813355Z fi_getinfo: -61 2022-11-23T02:55:39.5813480Z fi_getinfo: -61 2022-11-23T02:55:39.5813601Z fi_getinfo: -61 2022-11-23T02:55:39.5813690Z ok (15.957s) 2022-11-23T02:55:39.5813709Z 2022-11-23T02:55:39.5813962Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5814063Z Ran 1 test in 15.957s 2022-11-23T02:55:39.5814086Z 2022-11-23T02:55:39.5814165Z OK 2022-11-23T02:55:39.5814348Z 2022-11-23T02:55:39.5814462Z Generating XML reports... 2022-11-23T02:55:39.5814978Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024811.xml 2022-11-23T02:55:39.5815324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5815482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5815835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5816008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5816241Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9rfhn3t6 2022-11-23T02:55:39.5816490Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9rfhn3t6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5816512Z 2022-11-23T02:55:39.5816605Z Running tests... 2022-11-23T02:55:39.5816845Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5817185Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5817478Z test_device_maps_return_to_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5817680Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106747 2022-11-23T02:55:39.5817879Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106748 2022-11-23T02:55:39.5818252Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 106749 2022-11-23T02:55:39.5818452Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 106750 2022-11-23T02:55:39.5818813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5819030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5819400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5819581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5819937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5820098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5820456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5820633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5820980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5821189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5821706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5821877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5822216Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5822369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5822709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5822880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5823114Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9cdpnja3 2022-11-23T02:55:39.5823543Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9cdpnja3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5823794Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvtdv5ej2 2022-11-23T02:55:39.5824232Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvtdv5ej2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5824478Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm7tm54kg 2022-11-23T02:55:39.5824726Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm7tm54kg/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5824964Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp62szhcw8 2022-11-23T02:55:39.5825213Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp62szhcw8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5825426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5825641Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5825861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5826067Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5826202Z fi_getinfo: -61 
2022-11-23T02:55:39.5826328Z fi_getinfo: -61 2022-11-23T02:55:39.5826452Z fi_getinfo: -61 2022-11-23T02:55:39.5826575Z fi_getinfo: -61 2022-11-23T02:55:39.5826661Z ok (15.538s) 2022-11-23T02:55:39.5826681Z 2022-11-23T02:55:39.5826935Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5827036Z Ran 1 test in 15.538s 2022-11-23T02:55:39.5827056Z 2022-11-23T02:55:39.5827128Z OK 2022-11-23T02:55:39.5827147Z 2022-11-23T02:55:39.5827258Z Generating XML reports... 2022-11-23T02:55:39.5827792Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024830.xml 2022-11-23T02:55:39.5828223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5828398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5828765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5828943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5829186Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc3hmot64 2022-11-23T02:55:39.5829435Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc3hmot64/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5829462Z 2022-11-23T02:55:39.5829551Z Running tests... 2022-11-23T02:55:39.5829801Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5830145Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5830509Z test_device_maps_wrong_worker_name (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5830718Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107266 2022-11-23T02:55:39.5830924Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107267 2022-11-23T02:55:39.5831129Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 107268 2022-11-23T02:55:39.5831329Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 107269 2022-11-23T02:55:39.5831682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5831848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5832373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5832724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5833074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5833236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5833595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5833775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5834114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5834276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5834634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5834810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5835171Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5835332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5835686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5835861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5836104Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp55ve7ztp 2022-11-23T02:55:39.5836355Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp55ve7ztp/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5836597Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgknw5nvj 2022-11-23T02:55:39.5837004Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgknw5nvj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5837290Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0qkk8elx 2022-11-23T02:55:39.5837541Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0qkk8elx/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5837769Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8q8wxymq 2022-11-23T02:55:39.5838012Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8q8wxymq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5838223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5838430Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5838631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5838839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5838972Z fi_getinfo: -61 
2022-11-23T02:55:39.5839152Z fi_getinfo: -61 2022-11-23T02:55:39.5839271Z fi_getinfo: -61 2022-11-23T02:55:39.5839392Z fi_getinfo: -61 2022-11-23T02:55:39.5839478Z ok (4.847s) 2022-11-23T02:55:39.5839497Z 2022-11-23T02:55:39.5839921Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5840022Z Ran 1 test in 4.847s 2022-11-23T02:55:39.5840041Z 2022-11-23T02:55:39.5840119Z OK 2022-11-23T02:55:39.5840138Z 2022-11-23T02:55:39.5840250Z Generating XML reports... 2022-11-23T02:55:39.5840784Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024849.xml 2022-11-23T02:55:39.5841139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5841302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5841665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5841843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5842085Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd8i5nwey 2022-11-23T02:55:39.5842343Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd8i5nwey/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5842363Z 2022-11-23T02:55:39.5842620Z Running tests... 2022-11-23T02:55:39.5843037Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5843384Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5843667Z test_device_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5843873Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107613 2022-11-23T02:55:39.5844078Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107614 2022-11-23T02:55:39.5844284Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 107615 2022-11-23T02:55:39.5844487Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 107616 2022-11-23T02:55:39.5844846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5845012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5845380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5845560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5845912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5846075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5846470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5846640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5847002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5847181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5847544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5847720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5848228Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5848387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5848733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5848946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5849366Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo5vw7ogv 2022-11-23T02:55:39.5849626Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo5vw7ogv/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5849865Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcx2h03ff 2022-11-23T02:55:39.5850118Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcx2h03ff/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5850359Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmponvgr38_ 2022-11-23T02:55:39.5850609Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmponvgr38_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5850844Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr8j9_gir 2022-11-23T02:55:39.5851099Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr8j9_gir/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5851310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5851523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5851736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5852112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5852242Z fi_getinfo: -61 
2022-11-23T02:55:39.5852362Z fi_getinfo: -61 2022-11-23T02:55:39.5852480Z fi_getinfo: -61 2022-11-23T02:55:39.5852594Z fi_getinfo: -61 2022-11-23T02:55:39.5852710Z On WorkerInfo(id=1, name=worker1): 2022-11-23T02:55:39.5865416Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7fb36991d59b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7fb369918dfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xced (0x7fb375b6413d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fb375b6567f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fb375b66ef2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fb375e5bb7e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: <unknown function> + 0x2a0be3e (0x7fb36c7d3e3e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #7: <unknown function> + 0x2a0bf46 (0x7fb36c7d3f46 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fb37696dc58 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: <unknown function> + 0x35efc70 (0x7fb3781dfc70 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: <unknown function> + 0x35f03e9 (0x7fb3781e03e9 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fb3769a7e62 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: <unknown function> + 0x2ff562 (0x7fb3818ce562 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: <unknown function> + 0x2ff956 (0x7fb3818ce956 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: <unknown function> + 0x1ddc68 (0x559278e6dc68 in /opt/conda/bin/python)\nframe #15: <unknown function> + 0x199499 (0x559278e29499 in /opt/conda/bin/python)\nframe #16: <unknown function> + 0x1995fa (0x559278e295fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x559278dd54b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x559278e72098 in /opt/conda/bin/python)\nframe #19: <unknown function> + 0x18f742 (0x559278e1f742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x559278dd7faa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x559278e73774 in /opt/conda/bin/python)\nframe #22: <unknown function> + 0x18f742 (0x559278e1f742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x559278dd7faa in /opt/conda/bin/python)\nframe #24: <unknown function> + 0xaa8dba (0x7fb382077dba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fb382075ffd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector<c10::Stream, std::allocator<c10::Stream> >, bool) const + 0x85 (0x7fb3820792d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: 
torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x96 (0x7fb38207ab16 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x10c (0x7fb37961b5cc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x65 (0x7fb3820790c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: <unknown function> + 0x4a24a53 (0x7fb379614a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x538 (0x7fb3796155e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x57 (0x7fb37960f8e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: <unknown function> + 0x4a545d2 (0x7fb3796445d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fb36990b90b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: <unknown function> + 0xdbbf4 (0x7fb399771bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: <unknown function> + 0x76db (0x7fb3b9dc56db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7fb3b9aee61f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-11-23T02:55:39.5865646Z Traceback (most recent call last): 2022-11-23T02:55:39.5866042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", 
line 207, in _run_function 2022-11-23T02:55:39.5866219Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-11-23T02:55:39.5866771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5954, in _gpu_add_wrong_gpus 2022-11-23T02:55:39.5866877Z return x.cpu() + y.cuda() 2022-11-23T02:55:39.5867111Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 2022-11-23T02:55:39.5867383Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-11-23T02:55:39.5867980Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fb36991d59b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5868597Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fb369918dfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5869083Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xced (0x7fb375b6413d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5869536Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fb375b6567f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5870093Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fb375b66ef2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5870732Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fb375e5bb7e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5871101Z frame 
#6: + 0x2a0be3e (0x7fb36c7d3e3e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.5871460Z frame #7: + 0x2a0bf46 (0x7fb36c7d3f46 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.5871980Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fb37696dc58 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5872330Z frame #9: + 0x35efc70 (0x7fb3781dfc70 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5872676Z frame #10: + 0x35f03e9 (0x7fb3781e03e9 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5873365Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fb3769a7e62 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5873755Z frame #12: + 0x2ff562 (0x7fb3818ce562 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5874125Z frame #13: + 0x2ff956 (0x7fb3818ce956 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5874309Z frame #14: + 0x1ddc68 (0x559278e6dc68 in /opt/conda/bin/python) 2022-11-23T02:55:39.5874484Z frame #15: + 0x199499 (0x559278e29499 in /opt/conda/bin/python) 2022-11-23T02:55:39.5874657Z frame #16: + 0x1995fa (0x559278e295fa in /opt/conda/bin/python) 2022-11-23T02:55:39.5874828Z frame #17: PyNumber_Add + 0x41 (0x559278dd54b1 in /opt/conda/bin/python) 2022-11-23T02:55:39.5875022Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x559278e72098 in /opt/conda/bin/python) 2022-11-23T02:55:39.5875244Z frame #19: + 0x18f742 (0x559278e1f742 in /opt/conda/bin/python) 2022-11-23T02:55:39.5875421Z frame #20: _PyObject_Call + 0x20a (0x559278dd7faa in /opt/conda/bin/python) 2022-11-23T02:55:39.5875614Z frame #21: 
_PyEval_EvalFrameDefault + 0x26e4 (0x559278e73774 in /opt/conda/bin/python) 2022-11-23T02:55:39.5875788Z frame #22: + 0x18f742 (0x559278e1f742 in /opt/conda/bin/python) 2022-11-23T02:55:39.5876117Z frame #23: _PyObject_Call + 0x20a (0x559278dd7faa in /opt/conda/bin/python) 2022-11-23T02:55:39.5876483Z frame #24: + 0xaa8dba (0x7fb382077dba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5876973Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fb382075ffd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5877595Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fb3820792d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5878250Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fb38207ab16 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5878971Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fb37961b5cc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5879707Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fb3820790c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5880065Z frame #30: + 0x4a24a53 (0x7fb379614a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5880698Z frame #31: 
torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fb3796155e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5881284Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fb37960f8e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5881636Z frame #33: + 0x4a545d2 (0x7fb3796445d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5882074Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fb36990b90b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5882282Z frame #35: + 0xdbbf4 (0x7fb399771bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-11-23T02:55:39.5882583Z frame #36: + 0x76db (0x7fb3b9dc56db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-11-23T02:55:39.5882832Z frame #37: clone + 0x3f (0x7fb3b9aee61f in /lib/x86_64-linux-gnu/libc.so.6) 2022-11-23T02:55:39.5882852Z 2022-11-23T02:55:39.5882870Z 2022-11-23T02:55:39.5882988Z On WorkerInfo(id=0, name=worker0): 2022-11-23T02:55:39.5895831Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f882f1e359b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f882f1dedfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xced (0x7f883b42a13d in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f883b42b67f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f883b42cef2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f883b721b7e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a0be3e (0x7f8832099e3e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #7: + 0x2a0bf46 (0x7f8832099f46 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f883c233c58 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x35efc70 (0x7f883daa5c70 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x35f03e9 (0x7f883daa63e9 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f883c26de62 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x2ff562 (0x7f8847194562 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x2ff956 (0x7f8847194956 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x55e572bcfc68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x55e572b8b499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x55e572b8b5fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x55e572b374b1 in /opt/conda/bin/python)\nframe 
#18: _PyEval_EvalFrameDefault + 0x1008 (0x55e572bd4098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x55e572b81742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x55e572b39faa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55e572bd5774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x55e572b81742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x55e572b39faa in /opt/conda/bin/python)\nframe #24: + 0xaa8dba (0x7f884793ddba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f884793bffd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f884793f2d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f8847940b16 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f883eee15cc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f884793f0c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x4a24a53 (0x7f883eedaa53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, 
std::vector >) const + 0x538 (0x7f883eedb5e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f883eed58e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x4a545d2 (0x7f883ef0a5d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f882f1d190b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xdbbf4 (0x7f885f037bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7f887f68b6db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7f887f3b461f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-11-23T02:55:39.5896064Z Traceback (most recent call last): 2022-11-23T02:55:39.5896396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 207, in _run_function 2022-11-23T02:55:39.5896572Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-11-23T02:55:39.5896954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5954, in _gpu_add_wrong_gpus 2022-11-23T02:55:39.5897063Z return x.cpu() + y.cuda() 2022-11-23T02:55:39.5897295Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-11-23T02:55:39.5897557Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-11-23T02:55:39.5898083Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f882f1e359b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5898853Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f882f1dedfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5899385Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xced (0x7f883b42a13d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5899846Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f883b42b67f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5900407Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f883b42cef2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5900905Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f883b721b7e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5901268Z frame #6: + 0x2a0be3e (0x7f8832099e3e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.5901850Z frame #7: + 0x2a0bf46 (0x7f8832099f46 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.5902370Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f883c233c58 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5902722Z frame #9: + 0x35efc70 (0x7f883daa5c70 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5903073Z frame #10: + 0x35f03e9 (0x7f883daa63e9 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5903533Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f883c26de62 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5904260Z frame #12: + 0x2ff562 (0x7f8847194562 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5904648Z frame #13: + 0x2ff956 (0x7f8847194956 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5904834Z frame #14: + 0x1ddc68 (0x55e572bcfc68 in /opt/conda/bin/python) 2022-11-23T02:55:39.5905012Z frame #15: + 0x199499 (0x55e572b8b499 in /opt/conda/bin/python) 2022-11-23T02:55:39.5905182Z frame #16: + 0x1995fa (0x55e572b8b5fa in /opt/conda/bin/python) 2022-11-23T02:55:39.5905356Z frame #17: PyNumber_Add + 0x41 (0x55e572b374b1 in /opt/conda/bin/python) 2022-11-23T02:55:39.5905551Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x55e572bd4098 in /opt/conda/bin/python) 2022-11-23T02:55:39.5905728Z frame #19: + 0x18f742 (0x55e572b81742 in /opt/conda/bin/python) 2022-11-23T02:55:39.5905902Z frame #20: _PyObject_Call + 0x20a (0x55e572b39faa in /opt/conda/bin/python) 2022-11-23T02:55:39.5906107Z frame #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55e572bd5774 in /opt/conda/bin/python) 2022-11-23T02:55:39.5906282Z frame #22: + 0x18f742 (0x55e572b81742 in /opt/conda/bin/python) 2022-11-23T02:55:39.5906458Z frame #23: _PyObject_Call + 0x20a (0x55e572b39faa in /opt/conda/bin/python) 2022-11-23T02:55:39.5906840Z frame #24: + 0xaa8dba (0x7f884793ddba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 
2022-11-23T02:55:39.5907340Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f884793bffd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5908297Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f884793f2d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5909039Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f8847940b16 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5909803Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f883eee15cc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5910561Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f884793f0c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5910993Z frame #30: + 0x4a24a53 (0x7f883eedaa53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5911790Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f883eedb5e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5912382Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f883eed58e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 
2022-11-23T02:55:39.5912739Z frame #33: + 0x4a545d2 (0x7f883ef0a5d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5913128Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f882f1d190b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5913333Z frame #35: + 0xdbbf4 (0x7f885f037bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-11-23T02:55:39.5913630Z frame #36: + 0x76db (0x7f887f68b6db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-11-23T02:55:39.5913877Z frame #37: clone + 0x3f (0x7f887f3b461f in /lib/x86_64-linux-gnu/libc.so.6) 2022-11-23T02:55:39.5913897Z 2022-11-23T02:55:39.5913914Z 2022-11-23T02:55:39.5914031Z On WorkerInfo(id=3, name=worker3): 2022-11-23T02:55:39.5925624Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f536770a59b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f5367705dfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xced (0x7f537395113d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f537395267f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f5373953ef2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: 
at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f5373c48b7e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a0be3e (0x7f536a5c0e3e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #7: + 0x2a0bf46 (0x7f536a5c0f46 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f537475ac58 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x35efc70 (0x7f5375fccc70 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x35f03e9 (0x7f5375fcd3e9 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f5374794e62 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x2ff562 (0x7f537f6bb562 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x2ff956 (0x7f537f6bb956 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x55cf42833c68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x55cf427ef499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x55cf427ef5fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x55cf4279b4b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x55cf42838098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x55cf427e5742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x55cf4279dfaa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55cf42839774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x55cf427e5742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x55cf4279dfaa in /opt/conda/bin/python)\nframe #24: 
+ 0xaa8dba (0x7f537fe64dba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f537fe62ffd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f537fe662d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f537fe67b16 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f53774085cc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f537fe660c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x4a24a53 (0x7f5377401a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f53774025e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f53773fc8e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x4a545d2 (0x7f53774315d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: 
c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f53676f890b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xdbbf4 (0x7f539755ebf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7f53b7bb26db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7f53b78db61f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-11-23T02:55:39.5925826Z Traceback (most recent call last): 2022-11-23T02:55:39.5926164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 207, in _run_function 2022-11-23T02:55:39.5926336Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-11-23T02:55:39.5926901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5954, in _gpu_add_wrong_gpus 2022-11-23T02:55:39.5927056Z return x.cpu() + y.cuda() 2022-11-23T02:55:39.5927298Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-11-23T02:55:39.5927570Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-11-23T02:55:39.5928110Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f536770a59b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5928716Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f5367705dfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5929193Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xced (0x7f537395113d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5929645Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f537395267f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5930206Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f5373953ef2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5930705Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f5373c48b7e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5931076Z frame #6: + 0x2a0be3e (0x7f536a5c0e3e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.5931443Z frame #7: + 0x2a0bf46 (0x7f536a5c0f46 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.5931987Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f537475ac58 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5932354Z frame #9: + 0x35efc70 (0x7f5375fccc70 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5932874Z frame #10: + 0x35f03e9 (0x7f5375fcd3e9 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5933336Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f5374794e62 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5933693Z frame #12: + 0x2ff562 (0x7f537f6bb562 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5934131Z frame #13: + 0x2ff956 (0x7f537f6bb956 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5934318Z frame #14: + 0x1ddc68 (0x55cf42833c68 in /opt/conda/bin/python) 2022-11-23T02:55:39.5934492Z frame #15: + 0x199499 (0x55cf427ef499 in /opt/conda/bin/python) 2022-11-23T02:55:39.5934663Z frame #16: + 0x1995fa (0x55cf427ef5fa in /opt/conda/bin/python) 2022-11-23T02:55:39.5934831Z frame #17: PyNumber_Add + 0x41 (0x55cf4279b4b1 in /opt/conda/bin/python) 2022-11-23T02:55:39.5935015Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x55cf42838098 in /opt/conda/bin/python) 2022-11-23T02:55:39.5935187Z frame #19: + 0x18f742 (0x55cf427e5742 in /opt/conda/bin/python) 2022-11-23T02:55:39.5935359Z frame #20: _PyObject_Call + 0x20a (0x55cf4279dfaa in /opt/conda/bin/python) 2022-11-23T02:55:39.5935549Z frame #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55cf42839774 in /opt/conda/bin/python) 2022-11-23T02:55:39.5935762Z frame #22: + 0x18f742 (0x55cf427e5742 in /opt/conda/bin/python) 2022-11-23T02:55:39.5935936Z frame #23: _PyObject_Call + 0x20a (0x55cf4279dfaa in /opt/conda/bin/python) 2022-11-23T02:55:39.5936303Z frame #24: + 0xaa8dba (0x7f537fe64dba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 
2022-11-23T02:55:39.5936793Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f537fe62ffd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5937410Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f537fe662d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5938056Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f537fe67b16 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5938780Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f53774085cc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5939519Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f537fe660c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5939870Z frame #30: + 0x4a24a53 (0x7f5377401a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5940700Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f53774025e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5941315Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f53773fc8e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 
2022-11-23T02:55:39.5941678Z frame #33: + 0x4a545d2 (0x7f53774315d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5942078Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f53676f890b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5942285Z frame #35: + 0xdbbf4 (0x7f539755ebf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-11-23T02:55:39.5942636Z frame #36: + 0x76db (0x7f53b7bb26db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-11-23T02:55:39.5943063Z frame #37: clone + 0x3f (0x7f53b78db61f in /lib/x86_64-linux-gnu/libc.so.6) 2022-11-23T02:55:39.5943083Z 2022-11-23T02:55:39.5943101Z 2022-11-23T02:55:39.5943219Z On WorkerInfo(id=2, name=worker2): 2022-11-23T02:55:39.5955868Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f12f9bba59b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f12f9bb5dfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xced (0x7f1305e0113d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f1305e0267f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f1305e03ef2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: 
at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f13060f8b7e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a0be3e (0x7f12fca70e3e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #7: + 0x2a0bf46 (0x7f12fca70f46 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f1306c0ac58 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x35efc70 (0x7f130847cc70 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x35f03e9 (0x7f130847d3e9 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f1306c44e62 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x2ff562 (0x7f1311b6b562 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x2ff956 (0x7f1311b6b956 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x55de521bec68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x55de5217a499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x55de5217a5fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x55de521264b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x55de521c3098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x55de52170742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x55de52128faa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55de521c4774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x55de52170742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x55de52128faa in /opt/conda/bin/python)\nframe #24: 
+ 0xaa8dba (0x7f1312314dba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f1312312ffd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f13123162d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f1312317b16 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f13098b85cc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f13123160c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x4a24a53 (0x7f13098b1a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f13098b25e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f13098ac8e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x4a545d2 (0x7f13098e15d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: 
c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f12f9ba890b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xdbbf4 (0x7f1329a0ebf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7f134a0626db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7f1349d8b61f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-11-23T02:55:39.5956112Z Traceback (most recent call last): 2022-11-23T02:55:39.5956449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 207, in _run_function 2022-11-23T02:55:39.5956799Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-11-23T02:55:39.5957189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5954, in _gpu_add_wrong_gpus 2022-11-23T02:55:39.5957293Z return x.cpu() + y.cuda() 2022-11-23T02:55:39.5957529Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-11-23T02:55:39.5957807Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-11-23T02:55:39.5958353Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f12f9bba59b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5958961Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f12f9bb5dfe in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5959441Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xced (0x7f1305e0113d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5960038Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f1305e0267f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5960623Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f1305e03ef2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5961114Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f13060f8b7e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5961662Z frame #6: + 0x2a0be3e (0x7f12fca70e3e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.5962027Z frame #7: + 0x2a0bf46 (0x7f12fca70f46 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) 2022-11-23T02:55:39.5962565Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f1306c0ac58 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5962989Z frame #9: + 0x35efc70 (0x7f130847cc70 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5963354Z frame #10: + 0x35f03e9 (0x7f130847d3e9 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5963856Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f1306c44e62 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5964250Z frame #12: + 0x2ff562 (0x7f1311b6b562 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5964774Z frame #13: + 0x2ff956 (0x7f1311b6b956 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5964953Z frame #14: + 0x1ddc68 (0x55de521bec68 in /opt/conda/bin/python) 2022-11-23T02:55:39.5965132Z frame #15: + 0x199499 (0x55de5217a499 in /opt/conda/bin/python) 2022-11-23T02:55:39.5965302Z frame #16: + 0x1995fa (0x55de5217a5fa in /opt/conda/bin/python) 2022-11-23T02:55:39.5965468Z frame #17: PyNumber_Add + 0x41 (0x55de521264b1 in /opt/conda/bin/python) 2022-11-23T02:55:39.5965658Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x55de521c3098 in /opt/conda/bin/python) 2022-11-23T02:55:39.5965824Z frame #19: + 0x18f742 (0x55de52170742 in /opt/conda/bin/python) 2022-11-23T02:55:39.5965996Z frame #20: _PyObject_Call + 0x20a (0x55de52128faa in /opt/conda/bin/python) 2022-11-23T02:55:39.5966184Z frame #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55de521c4774 in /opt/conda/bin/python) 2022-11-23T02:55:39.5966352Z frame #22: + 0x18f742 (0x55de52170742 in /opt/conda/bin/python) 2022-11-23T02:55:39.5966519Z frame #23: _PyObject_Call + 0x20a (0x55de52128faa in /opt/conda/bin/python) 2022-11-23T02:55:39.5967057Z frame #24: + 0xaa8dba (0x7f1312314dba in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 
2022-11-23T02:55:39.5967569Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f1312312ffd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5968256Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f13123162d5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5968925Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f1312317b16 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5969721Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f13098b85cc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5970652Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f13123160c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-11-23T02:55:39.5971009Z frame #30: + 0x4a24a53 (0x7f13098b1a53 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5971634Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f13098b25e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5972282Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f13098ac8e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 
2022-11-23T02:55:39.5972636Z frame #33: + 0x4a545d2 (0x7f13098e15d2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-11-23T02:55:39.5973025Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f12f9ba890b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-11-23T02:55:39.5973224Z frame #35: + 0xdbbf4 (0x7f1329a0ebf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-11-23T02:55:39.5973693Z frame #36: + 0x76db (0x7f134a0626db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-11-23T02:55:39.5973948Z frame #37: clone + 0x3f (0x7f1349d8b61f in /lib/x86_64-linux-gnu/libc.so.6) 2022-11-23T02:55:39.5973973Z 2022-11-23T02:55:39.5973991Z 2022-11-23T02:55:39.5974084Z ok (8.045s) 2022-11-23T02:55:39.5974103Z 2022-11-23T02:55:39.5974360Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5974460Z Ran 1 test in 8.045s 2022-11-23T02:55:39.5974478Z 2022-11-23T02:55:39.5974557Z OK 2022-11-23T02:55:39.5974576Z 2022-11-23T02:55:39.5974682Z Generating XML reports... 
2022-11-23T02:55:39.5975220Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024856.xml 2022-11-23T02:55:39.5975582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5975748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5976117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5976462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5976699Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjq7ktnm4 2022-11-23T02:55:39.5976950Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjq7ktnm4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5976970Z 2022-11-23T02:55:39.5977063Z Running tests... 2022-11-23T02:55:39.5977302Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5977636Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5977921Z test_devices_option_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5978122Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108120 2022-11-23T02:55:39.5978323Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108121 2022-11-23T02:55:39.5978585Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 108122 2022-11-23T02:55:39.5978788Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 108123 2022-11-23T02:55:39.5979134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5979286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5979638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5979811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5980155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5980314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5980654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5980864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5981219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5981393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5981735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5981907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5982250Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5982405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5982751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5982926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5983164Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1e7hmgb_ 2022-11-23T02:55:39.5983400Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphhaaml7r 2022-11-23T02:55:39.5983640Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1e7hmgb_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5984077Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphhaaml7r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5984322Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbux1x0vv 2022-11-23T02:55:39.5984565Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbux1x0vv/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5984796Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfv3a8dhc 2022-11-23T02:55:39.5985047Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfv3a8dhc/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5985255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5985644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5985860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5986067Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5986209Z fi_getinfo: -61 
2022-11-23T02:55:39.5986334Z fi_getinfo: -61 2022-11-23T02:55:39.5986455Z fi_getinfo: -61 2022-11-23T02:55:39.5986575Z fi_getinfo: -61 2022-11-23T02:55:39.5986662Z ok (4.718s) 2022-11-23T02:55:39.5986681Z 2022-11-23T02:55:39.5986937Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5987030Z Ran 1 test in 4.719s 2022-11-23T02:55:39.5987053Z 2022-11-23T02:55:39.5987134Z OK 2022-11-23T02:55:39.5987153Z 2022-11-23T02:55:39.5987332Z Generating XML reports... 2022-11-23T02:55:39.5987877Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024907.xml 2022-11-23T02:55:39.5988398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5988556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5989084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5989265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5989509Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpphgx_hg8 2022-11-23T02:55:39.5989757Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpphgx_hg8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5989835Z 2022-11-23T02:55:39.5989941Z Running tests... 2022-11-23T02:55:39.5990200Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.5990545Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.5990849Z test_devices_option_mismatch_reverse (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.5991058Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108467 2022-11-23T02:55:39.5991264Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108468 2022-11-23T02:55:39.5991468Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 108469 2022-11-23T02:55:39.5991660Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 108470 2022-11-23T02:55:39.5992165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5992504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5992873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5993051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5993402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5993564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5993922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5994099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5994444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5994613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5994977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5995153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5995654Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.5995810Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.5996153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.5996321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.5996552Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_g9ph7xw 2022-11-23T02:55:39.5996803Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_g9ph7xw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5997087Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnbe1vxb_ 2022-11-23T02:55:39.5997345Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnbe1vxb_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5997574Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0d8xjlg0 2022-11-23T02:55:39.5997815Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0d8xjlg0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5998043Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpejf06leq 2022-11-23T02:55:39.5998284Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpejf06leq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.5998493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.5998695Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.5999125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.5999337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.5999474Z fi_getinfo: -61 
2022-11-23T02:55:39.5999598Z fi_getinfo: -61 2022-11-23T02:55:39.5999721Z fi_getinfo: -61 2022-11-23T02:55:39.5999842Z fi_getinfo: -61 2022-11-23T02:55:39.5999923Z ok (4.837s) 2022-11-23T02:55:39.5999943Z 2022-11-23T02:55:39.6000196Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6000295Z Ran 1 test in 4.837s 2022-11-23T02:55:39.6000314Z 2022-11-23T02:55:39.6000392Z OK 2022-11-23T02:55:39.6000411Z 2022-11-23T02:55:39.6000523Z Generating XML reports... 2022-11-23T02:55:39.6001059Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024915.xml 2022-11-23T02:55:39.6001422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6001586Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6002104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6002271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6002505Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcibwjgi9 2022-11-23T02:55:39.6002756Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcibwjgi9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6002775Z 2022-11-23T02:55:39.6002865Z Running tests... 2022-11-23T02:55:39.6003107Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6003440Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6003747Z test_owner_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6003951Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108814 2022-11-23T02:55:39.6004142Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108815 2022-11-23T02:55:39.6004338Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 108816 2022-11-23T02:55:39.6004533Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 108817 2022-11-23T02:55:39.6004880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6005037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6005390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6005567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6005948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6006111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6006455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6006626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6006960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6007113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6007459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6007628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6008021Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6008176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6008686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6008864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6009108Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplsknlab5 2022-11-23T02:55:39.6009365Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplsknlab5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6009607Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptg62224l 2022-11-23T02:55:39.6009860Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptg62224l/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6010107Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmey4im7r 2022-11-23T02:55:39.6010361Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmey4im7r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6010598Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgmwda3jf 2022-11-23T02:55:39.6010843Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgmwda3jf/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6011058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6011274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6011647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6011855Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6011991Z fi_getinfo: -61 
2022-11-23T02:55:39.6012078Z ok (10.850s) 2022-11-23T02:55:39.6012096Z 2022-11-23T02:55:39.6012347Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6012438Z Ran 1 test in 10.850s 2022-11-23T02:55:39.6012456Z 2022-11-23T02:55:39.6012533Z OK 2022-11-23T02:55:39.6012551Z 2022-11-23T02:55:39.6012661Z Generating XML reports... 2022-11-23T02:55:39.6013181Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024922.xml 2022-11-23T02:55:39.6013523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6013680Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6014031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6014206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6014493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpphwcw4im 2022-11-23T02:55:39.6014744Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpphwcw4im/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6014763Z 2022-11-23T02:55:39.6014856Z Running tests... 2022-11-23T02:55:39.6015098Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6015434Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6015733Z test_owner_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6015935Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109189 2022-11-23T02:55:39.6016133Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109190 2022-11-23T02:55:39.6016328Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109191 2022-11-23T02:55:39.6016569Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109192 2022-11-23T02:55:39.6016916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6017073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6017414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6017571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6017923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6018095Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6018443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6018621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6018953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6019109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6019452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6019621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6019963Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6020118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6020459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6020632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6020867Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphl5rproj 2022-11-23T02:55:39.6021119Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphl5rproj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6021352Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp03q1hq1_ 2022-11-23T02:55:39.6021595Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp03q1hq1_/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6021823Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpllq3j1u5 2022-11-23T02:55:39.6022068Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpllq3j1u5/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6022297Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaygq9j2t 2022-11-23T02:55:39.6022540Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaygq9j2t/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6022796Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6023003Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6023211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6023519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6023659Z fi_getinfo: -61 
2022-11-23T02:55:39.6023743Z ok (12.838s) 2022-11-23T02:55:39.6023762Z 2022-11-23T02:55:39.6024193Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6024297Z Ran 1 test in 12.838s 2022-11-23T02:55:39.6024316Z 2022-11-23T02:55:39.6024394Z OK 2022-11-23T02:55:39.6024413Z 2022-11-23T02:55:39.6024514Z Generating XML reports... 2022-11-23T02:55:39.6025043Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024936.xml 2022-11-23T02:55:39.6025484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6025640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6025992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6026162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6026393Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjw1o48i9 2022-11-23T02:55:39.6026637Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjw1o48i9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6026656Z 2022-11-23T02:55:39.6026748Z Running tests... 2022-11-23T02:55:39.6027169Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6027528Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6027839Z test_owner_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6028048Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109565 2022-11-23T02:55:39.6028253Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109566 2022-11-23T02:55:39.6028457Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109567 2022-11-23T02:55:39.6028653Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109568 2022-11-23T02:55:39.6029013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6029170Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6029541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6029723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6030074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6030237Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6030588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6030750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6031111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6031289Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6031641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6031878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6032242Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6032404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6032760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6033094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6033328Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprw5u_4mi 2022-11-23T02:55:39.6033576Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprw5u_4mi/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6033805Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf9phq06i 2022-11-23T02:55:39.6034104Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf9phq06i/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6034333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0bfqmzw8 2022-11-23T02:55:39.6034578Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0bfqmzw8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6034805Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1h6lkm9a 2022-11-23T02:55:39.6035045Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1h6lkm9a/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6035253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6035462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6035666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6035868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6036005Z fi_getinfo: -61 
2022-11-23T02:55:39.6036090Z ok (12.946s) 2022-11-23T02:55:39.6036108Z 2022-11-23T02:55:39.6036352Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6036448Z Ran 1 test in 12.946s 2022-11-23T02:55:39.6036467Z 2022-11-23T02:55:39.6036544Z OK 2022-11-23T02:55:39.6036562Z 2022-11-23T02:55:39.6036670Z Generating XML reports... 2022-11-23T02:55:39.6037184Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024952.xml 2022-11-23T02:55:39.6037527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6037685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6038037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6038217Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6038449Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyfo7yhnl 2022-11-23T02:55:39.6038697Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyfo7yhnl/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6038716Z 2022-11-23T02:55:39.6038807Z Running tests... 2022-11-23T02:55:39.6039051Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6039384Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6039678Z test_owner_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6039879Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109941 2022-11-23T02:55:39.6040078Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109942 2022-11-23T02:55:39.6040497Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109943 2022-11-23T02:55:39.6040708Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109944 2022-11-23T02:55:39.6041067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6041231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6041600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6041772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6042123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6042282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6042697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6042876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6043382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6043537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6043878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6044047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6044381Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6044536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6044885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6045064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6045299Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp33hq22fd 2022-11-23T02:55:39.6045532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyfitlpsh 2022-11-23T02:55:39.6045778Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp33hq22fd/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6046200Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyfitlpsh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6046432Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1jirjw99 2022-11-23T02:55:39.6046685Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1jirjw99/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6046922Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9bhb_l84 2022-11-23T02:55:39.6047180Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9bhb_l84/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6047395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6047610Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6047825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6048037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6048173Z fi_getinfo: -61 
2022-11-23T02:55:39.6048254Z ok (10.832s) 2022-11-23T02:55:39.6048274Z 2022-11-23T02:55:39.6048525Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6048625Z Ran 1 test in 10.832s 2022-11-23T02:55:39.6048644Z 2022-11-23T02:55:39.6048722Z OK 2022-11-23T02:55:39.6048745Z 2022-11-23T02:55:39.6048858Z Generating XML reports... 2022-11-23T02:55:39.6049436Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025008.xml 2022-11-23T02:55:39.6049804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6049967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6050326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6050506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6050748Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgw9sig0c 2022-11-23T02:55:39.6051007Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgw9sig0c/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6051026Z 2022-11-23T02:55:39.6051169Z Running tests... 2022-11-23T02:55:39.6051423Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6051768Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6052068Z test_rref_as_arg_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6052279Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110316 2022-11-23T02:55:39.6052479Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110317 2022-11-23T02:55:39.6052843Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110318 2022-11-23T02:55:39.6053205Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110319 2022-11-23T02:55:39.6053563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6053731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6054097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6054276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6054628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6054783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6055145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6055322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6055668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6055828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6056192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6056368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6056719Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6056878Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6057226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6057401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6057646Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprf24j8_f 2022-11-23T02:55:39.6057905Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprf24j8_f/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6058196Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoax_jfd9 2022-11-23T02:55:39.6058460Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoax_jfd9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6058698Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5de8rcu9 2022-11-23T02:55:39.6058948Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5de8rcu9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6059179Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbmq_deqq 2022-11-23T02:55:39.6059428Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbmq_deqq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6059795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6060003Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6060256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6060466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6060600Z fi_getinfo: -61 
2022-11-23T02:55:39.6060720Z fi_getinfo: -61 2022-11-23T02:55:39.6060832Z fi_getinfo: -61 2022-11-23T02:55:39.6060951Z fi_getinfo: -61 2022-11-23T02:55:39.6061035Z ok (17.986s) 2022-11-23T02:55:39.6061054Z 2022-11-23T02:55:39.6061297Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6061392Z Ran 1 test in 17.986s 2022-11-23T02:55:39.6061410Z 2022-11-23T02:55:39.6061665Z OK 2022-11-23T02:55:39.6061684Z 2022-11-23T02:55:39.6061797Z Generating XML reports... 2022-11-23T02:55:39.6062333Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025021.xml 2022-11-23T02:55:39.6062686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6062860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6063230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6063408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6063652Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp32c0ez1k 2022-11-23T02:55:39.6064115Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp32c0ez1k/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6064138Z 2022-11-23T02:55:39.6064243Z Running tests... 2022-11-23T02:55:39.6064501Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6064848Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6065142Z test_rref_as_arg_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6065358Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110823 2022-11-23T02:55:39.6065723Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110824 2022-11-23T02:55:39.6065922Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110825 2022-11-23T02:55:39.6066116Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110826 2022-11-23T02:55:39.6066464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6066624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6066983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6067150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6067740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6067960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6068331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6068507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6068855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6069018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6069375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6069549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6069893Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6070158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6070517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6070691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6070932Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplw5pzan3 2022-11-23T02:55:39.6071189Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplw5pzan3/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6071431Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps42vguya 2022-11-23T02:55:39.6071688Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps42vguya/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6071921Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprmksy1dw 2022-11-23T02:55:39.6072186Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprmksy1dw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6072429Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9pgfv40k 2022-11-23T02:55:39.6072678Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9pgfv40k/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6072893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6073108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6073480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6073865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6074003Z fi_getinfo: -61 
2022-11-23T02:55:39.6074121Z fi_getinfo: -61 2022-11-23T02:55:39.6074249Z fi_getinfo: -61 2022-11-23T02:55:39.6074370Z fi_getinfo: -61 2022-11-23T02:55:39.6074458Z ok (20.465s) 2022-11-23T02:55:39.6074481Z 2022-11-23T02:55:39.6074734Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6074839Z Ran 1 test in 20.465s 2022-11-23T02:55:39.6074858Z 2022-11-23T02:55:39.6074936Z OK 2022-11-23T02:55:39.6074956Z 2022-11-23T02:55:39.6075064Z Generating XML reports... 2022-11-23T02:55:39.6075595Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025042.xml 2022-11-23T02:55:39.6075953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6076117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6076483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6076663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6076952Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3vm4deud 2022-11-23T02:55:39.6077217Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3vm4deud/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6077237Z 2022-11-23T02:55:39.6077333Z Running tests... 2022-11-23T02:55:39.6077582Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6077927Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6078229Z test_rref_as_arg_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6079108Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/81962 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.739s) 2022-11-23T02:55:39.6079174Z 2022-11-23T02:55:39.6079423Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6079519Z Ran 1 test in 1.739s 2022-11-23T02:55:39.6079537Z 2022-11-23T02:55:39.6079629Z OK (skipped=1) 2022-11-23T02:55:39.6079647Z 2022-11-23T02:55:39.6079755Z Generating XML reports... 2022-11-23T02:55:39.6080267Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025105.xml 2022-11-23T02:55:39.6080609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6080762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6081115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6081290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6081526Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa7ojnuly 2022-11-23T02:55:39.6081775Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa7ojnuly/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6081794Z 2022-11-23T02:55:39.6081887Z Running tests... 
2022-11-23T02:55:39.6082132Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6082464Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6082748Z test_rref_as_arg_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6082950Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111402 2022-11-23T02:55:39.6083147Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111403 2022-11-23T02:55:39.6083351Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111404 2022-11-23T02:55:39.6083546Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111405 2022-11-23T02:55:39.6083894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6084052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6084407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6084579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6084909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6085065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6085410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6085627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6086158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6086320Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6086676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6086850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6087194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6087357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6087711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6087947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6088194Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3lzflvm6 2022-11-23T02:55:39.6088452Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3lzflvm6/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6088850Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx22s1ai8 2022-11-23T02:55:39.6089096Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx22s1ai8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6089500Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvnxdv_j4 2022-11-23T02:55:39.6089746Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvnxdv_j4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6089982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6dm8jrtt 2022-11-23T02:55:39.6090242Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6dm8jrtt/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6090458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6090673Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.6090888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6091101Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6091236Z fi_getinfo: -61 2022-11-23T02:55:39.6091354Z fi_getinfo: -61 2022-11-23T02:55:39.6091477Z fi_getinfo: -61 2022-11-23T02:55:39.6091599Z fi_getinfo: -61 2022-11-23T02:55:39.6091689Z ok (20.449s) 2022-11-23T02:55:39.6091709Z 2022-11-23T02:55:39.6091962Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6092063Z Ran 1 test in 20.450s 2022-11-23T02:55:39.6092087Z 2022-11-23T02:55:39.6092167Z OK 2022-11-23T02:55:39.6092186Z 2022-11-23T02:55:39.6092302Z Generating XML reports... 2022-11-23T02:55:39.6092833Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025110.xml 2022-11-23T02:55:39.6093190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6093354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6093722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6093902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6094142Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpah_tdxv9 2022-11-23T02:55:39.6094398Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpah_tdxv9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6094422Z 2022-11-23T02:55:39.6094520Z Running tests... 
2022-11-23T02:55:39.6094812Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6095169Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6095467Z test_rref_as_arg_synchronization5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6095677Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111915 2022-11-23T02:55:39.6095882Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111916 2022-11-23T02:55:39.6096083Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111917 2022-11-23T02:55:39.6096284Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111918 2022-11-23T02:55:39.6096644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6096860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6097226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6097408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6097760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6097921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6098282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6098459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6099140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6099305Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6099658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6099834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6100187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6100349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6100704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6100879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6101118Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_j_kh86z 2022-11-23T02:55:39.6101373Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_j_kh86z/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6101627Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpebyfuo4w 2022-11-23T02:55:39.6101877Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpebyfuo4w/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6102277Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp46fnn8ey 2022-11-23T02:55:39.6102522Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp46fnn8ey/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6102752Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbn5f8z2x 2022-11-23T02:55:39.6102993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbn5f8z2x/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6103202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6103410Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.6103662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6104057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6104199Z fi_getinfo: -61 2022-11-23T02:55:39.6104319Z fi_getinfo: -61 2022-11-23T02:55:39.6104439Z fi_getinfo: -61 2022-11-23T02:55:39.6104557Z fi_getinfo: -61 2022-11-23T02:55:39.6104641Z ok (18.147s) 2022-11-23T02:55:39.6104660Z 2022-11-23T02:55:39.6104903Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6105002Z Ran 1 test in 18.148s 2022-11-23T02:55:39.6105022Z 2022-11-23T02:55:39.6105091Z OK 2022-11-23T02:55:39.6105109Z 2022-11-23T02:55:39.6105218Z Generating XML reports... 2022-11-23T02:55:39.6105731Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025133.xml 2022-11-23T02:55:39.6106158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6106316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6106838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6107019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6107262Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaw1zvohm 2022-11-23T02:55:39.6107513Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaw1zvohm/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6107540Z 2022-11-23T02:55:39.6107629Z Running tests... 
2022-11-23T02:55:39.6107886Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6108228Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6108537Z test_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6108747Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112422 2022-11-23T02:55:39.6108953Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112423 2022-11-23T02:55:39.6109156Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112424 2022-11-23T02:55:39.6109356Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112425 2022-11-23T02:55:39.6109710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6109874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6110238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6110420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6110776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6110937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6111291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6111455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6111810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6111988Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6112350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6112527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6112938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6113108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6113471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6113648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6113892Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphwg93_lm 2022-11-23T02:55:39.6114140Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphwg93_lm/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6114381Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpixgam48e 2022-11-23T02:55:39.6114786Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpixgam48e/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6115067Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6y6kzldv 2022-11-23T02:55:39.6115294Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo1syordd 2022-11-23T02:55:39.6115537Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6y6kzldv/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6115780Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo1syordd/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6115986Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6116194Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 2 2022-11-23T02:55:39.6116396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6116600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6116731Z fi_getinfo: -61 2022-11-23T02:55:39.6116857Z fi_getinfo: -61 2022-11-23T02:55:39.6116975Z fi_getinfo: -61 2022-11-23T02:55:39.6117096Z fi_getinfo: -61 2022-11-23T02:55:39.6117180Z ok (16.636s) 2022-11-23T02:55:39.6117199Z 2022-11-23T02:55:39.6117437Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6117540Z Ran 1 test in 16.636s 2022-11-23T02:55:39.6117558Z 2022-11-23T02:55:39.6117635Z OK 2022-11-23T02:55:39.6117653Z 2022-11-23T02:55:39.6117762Z Generating XML reports... 2022-11-23T02:55:39.6118278Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025154.xml 2022-11-23T02:55:39.6118804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6118968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6119333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6119512Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6119753Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0mi7wjl7 2022-11-23T02:55:39.6120009Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0mi7wjl7/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6120029Z 2022-11-23T02:55:39.6120123Z Running tests... 
2022-11-23T02:55:39.6120376Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6120725Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6121028Z test_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6121235Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112928 2022-11-23T02:55:39.6121439Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112929 2022-11-23T02:55:39.6121686Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112930 2022-11-23T02:55:39.6122054Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112931 2022-11-23T02:55:39.6122407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6122564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6122917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6123089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6123426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6123582Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6124166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6124338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6124687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6124850Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6125207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6125382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6125732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6125890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6126242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6126417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6126658Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxqsq_1qq 2022-11-23T02:55:39.6126915Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxqsq_1qq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6127158Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprmlr5lc2 2022-11-23T02:55:39.6127411Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprmlr5lc2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6127651Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzbaaaja0 2022-11-23T02:55:39.6127904Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzbaaaja0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6128141Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph5ob87nq 2022-11-23T02:55:39.6128401Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph5ob87nq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6128612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6128828Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.6129040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6129250Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6129382Z fi_getinfo: -61 2022-11-23T02:55:39.6129506Z fi_getinfo: -61 2022-11-23T02:55:39.6129629Z fi_getinfo: -61 2022-11-23T02:55:39.6129746Z fi_getinfo: -61 2022-11-23T02:55:39.6129833Z ok (17.036s) 2022-11-23T02:55:39.6129853Z 2022-11-23T02:55:39.6130104Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6130208Z Ran 1 test in 17.037s 2022-11-23T02:55:39.6130227Z 2022-11-23T02:55:39.6130305Z OK 2022-11-23T02:55:39.6130370Z 2022-11-23T02:55:39.6130489Z Generating XML reports... 2022-11-23T02:55:39.6131026Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025214.xml 2022-11-23T02:55:39.6131381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6131544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6131903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6132081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6132321Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn7_vxwm4 2022-11-23T02:55:39.6132574Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn7_vxwm4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6132638Z 2022-11-23T02:55:39.6132742Z Running tests... 
2022-11-23T02:55:39.6132993Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6133337Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6133638Z test_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6133841Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113437 2022-11-23T02:55:39.6134049Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113438 2022-11-23T02:55:39.6134251Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 113439 2022-11-23T02:55:39.6134454Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 113440 2022-11-23T02:55:39.6134810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6134983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6135349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6135531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6135886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6136042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6136402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6136579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6136926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6137097Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6137453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6137627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6137982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6138134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6138489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6138663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6138901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_b7xutfg 2022-11-23T02:55:39.6139204Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_b7xutfg/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6139453Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp84qutlj0 2022-11-23T02:55:39.6139708Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp84qutlj0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6139947Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1zpacm34 2022-11-23T02:55:39.6140199Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1zpacm34/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6140430Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3dovpkn8 2022-11-23T02:55:39.6141016Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3dovpkn8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6141234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6141495Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 1 2022-11-23T02:55:39.6141714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6141927Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6142062Z fi_getinfo: -61 2022-11-23T02:55:39.6142187Z fi_getinfo: -61 2022-11-23T02:55:39.6142303Z fi_getinfo: -61 2022-11-23T02:55:39.6142424Z fi_getinfo: -61 2022-11-23T02:55:39.6142511Z ok (16.839s) 2022-11-23T02:55:39.6142531Z 2022-11-23T02:55:39.6142783Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6142885Z Ran 1 test in 16.839s 2022-11-23T02:55:39.6142905Z 2022-11-23T02:55:39.6142983Z OK 2022-11-23T02:55:39.6143002Z 2022-11-23T02:55:39.6143116Z Generating XML reports... 2022-11-23T02:55:39.6143809Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025234.xml 2022-11-23T02:55:39.6144377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6144539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6144894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6145065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6145341Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp44rtexoa 2022-11-23T02:55:39.6145806Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp44rtexoa/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6145834Z 2022-11-23T02:55:39.6146010Z Running tests... 
2022-11-23T02:55:39.6146639Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6147266Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6147837Z test_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6148233Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113946 2022-11-23T02:55:39.6148593Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113947 2022-11-23T02:55:39.6148952Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 113948 2022-11-23T02:55:39.6149290Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 113949 2022-11-23T02:55:39.6149813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6149970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6150513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6150771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6151138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6151301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6151665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6151841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6152196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6152357Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6152718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6152974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6153479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6153633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6153976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6154145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6154382Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0bf4y7f4 2022-11-23T02:55:39.6154629Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0bf4y7f4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6154863Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplzkapv85 2022-11-23T02:55:39.6155110Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplzkapv85/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6155343Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjhwqbd5r 2022-11-23T02:55:39.6155590Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjhwqbd5r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6155822Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpku3wpmcj 2022-11-23T02:55:39.6156065Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpku3wpmcj/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6156274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6156482Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.6156688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6156893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6157197Z fi_getinfo: -61 2022-11-23T02:55:39.6157327Z fi_getinfo: -61 2022-11-23T02:55:39.6157455Z fi_getinfo: -61 2022-11-23T02:55:39.6157577Z fi_getinfo: -61 2022-11-23T02:55:39.6157664Z ok (16.555s) 2022-11-23T02:55:39.6157683Z 2022-11-23T02:55:39.6157938Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6158040Z Ran 1 test in 16.555s 2022-11-23T02:55:39.6158059Z 2022-11-23T02:55:39.6158133Z OK 2022-11-23T02:55:39.6158158Z 2022-11-23T02:55:39.6158264Z Generating XML reports... 2022-11-23T02:55:39.6158799Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025253.xml 2022-11-23T02:55:39.6159155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6159318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6159685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6160076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6160317Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2j8xby74 2022-11-23T02:55:39.6160563Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2j8xby74/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6160583Z 2022-11-23T02:55:39.6160668Z Running tests... 
2022-11-23T02:55:39.6160916Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6161251Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6161543Z test_rref_to_here_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6161744Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 114452 2022-11-23T02:55:39.6161991Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 114453 2022-11-23T02:55:39.6162368Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 114454 2022-11-23T02:55:39.6162567Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 114455 2022-11-23T02:55:39.6162925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6163083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6163449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6163627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6164027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6164194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6164563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6164743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6165090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6165407Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6165752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6165920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6166260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6166415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6166762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6166930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6167166Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphyjdf8vb 2022-11-23T02:55:39.6167419Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphyjdf8vb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6167644Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxhelnuw1 2022-11-23T02:55:39.6168129Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxhelnuw1/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6168373Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpovgd74o9 2022-11-23T02:55:39.6168628Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpovgd74o9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6168868Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4rl0sd24 2022-11-23T02:55:39.6169166Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4rl0sd24/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6169392Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6169608Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:55:39.6169815Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6170028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6170168Z fi_getinfo: -61 2022-11-23T02:55:39.6170293Z fi_getinfo: -61 2022-11-23T02:55:39.6170414Z fi_getinfo: -61 2022-11-23T02:55:39.6170534Z fi_getinfo: -61 2022-11-23T02:55:39.6170620Z ok (17.827s) 2022-11-23T02:55:39.6170639Z 2022-11-23T02:55:39.6170889Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6171034Z Ran 1 test in 17.828s 2022-11-23T02:55:39.6171057Z 2022-11-23T02:55:39.6171139Z OK 2022-11-23T02:55:39.6171159Z 2022-11-23T02:55:39.6171271Z Generating XML reports... 2022-11-23T02:55:39.6171808Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025313.xml 2022-11-23T02:55:39.6172165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6172328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6172691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6172868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6173100Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7obf899f 2022-11-23T02:55:39.6173361Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7obf899f/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6173382Z 2022-11-23T02:55:39.6173479Z Running tests... 
2022-11-23T02:55:39.6173733Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6174398Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6174702Z test_rref_to_here_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6174911Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 114959 2022-11-23T02:55:39.6175118Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 114960 2022-11-23T02:55:39.6175321Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 114961 2022-11-23T02:55:39.6175516Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 114962 2022-11-23T02:55:39.6175883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6176049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6176417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6176597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6176946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6177268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6177619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6177785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6178171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6178331Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6178679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6178849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6179191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6179346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6179690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6179859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6180088Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp044ilshh 2022-11-23T02:55:39.6180387Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp044ilshh/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6180620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxf1pezn0 2022-11-23T02:55:39.6180867Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxf1pezn0/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6181097Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7t13cktp 2022-11-23T02:55:39.6181342Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7t13cktp/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6181573Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgbgd9jdm 2022-11-23T02:55:39.6181818Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgbgd9jdm/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6182020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6182235Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.6182440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6182646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6182778Z fi_getinfo: -61 2022-11-23T02:55:39.6182897Z fi_getinfo: -61 2022-11-23T02:55:39.6183014Z fi_getinfo: -61 2022-11-23T02:55:39.6183132Z fi_getinfo: -61 2022-11-23T02:55:39.6183210Z ok (20.451s) 2022-11-23T02:55:39.6183229Z 2022-11-23T02:55:39.6183475Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6183572Z Ran 1 test in 20.451s 2022-11-23T02:55:39.6183591Z 2022-11-23T02:55:39.6183666Z OK 2022-11-23T02:55:39.6183685Z 2022-11-23T02:55:39.6183792Z Generating XML reports... 2022-11-23T02:55:39.6184530Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025333.xml 2022-11-23T02:55:39.6184888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6185046Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6185395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6185569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6185802Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpewbzvz7l 2022-11-23T02:55:39.6186051Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpewbzvz7l/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6186245Z 2022-11-23T02:55:39.6186342Z Running tests... 
2022-11-23T02:55:39.6186596Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6187014Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6187329Z test_rref_to_here_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6187538Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 115472 2022-11-23T02:55:39.6187739Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 115473 2022-11-23T02:55:39.6187940Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 115474 2022-11-23T02:55:39.6188140Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 115475 2022-11-23T02:55:39.6188501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6188662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6189188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6189426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6189955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6190108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6190473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6190651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6190999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6191158Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6191513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6191694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6192045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6192205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6192707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6192876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6193289Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxez3bw2q 2022-11-23T02:55:39.6193549Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxez3bw2q/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6193793Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptvk9iv7v 2022-11-23T02:55:39.6194052Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptvk9iv7v/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6194298Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnp2gturu 2022-11-23T02:55:39.6194553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnp2gturu/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6194782Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3uk30z9d 2022-11-23T02:55:39.6195031Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3uk30z9d/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6195246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6195463Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.6195675Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6195887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6196181Z fi_getinfo: -61 2022-11-23T02:55:39.6196346Z fi_getinfo: -61 2022-11-23T02:55:39.6196468Z fi_getinfo: -61 2022-11-23T02:55:39.6196588Z fi_getinfo: -61 2022-11-23T02:55:39.6196673Z ok (18.035s) 2022-11-23T02:55:39.6196692Z 2022-11-23T02:55:39.6196934Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6197033Z Ran 1 test in 18.035s 2022-11-23T02:55:39.6197052Z 2022-11-23T02:55:39.6197127Z OK 2022-11-23T02:55:39.6197145Z 2022-11-23T02:55:39.6197253Z Generating XML reports... 2022-11-23T02:55:39.6197769Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025357.xml 2022-11-23T02:55:39.6198110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6198269Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6198678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6198853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6199088Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpth6yltju 2022-11-23T02:55:39.6199335Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpth6yltju/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6199354Z 2022-11-23T02:55:39.6199448Z Running tests... 
2022-11-23T02:55:39.6199876Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6200227Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6200524Z test_rref_to_here_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6200733Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 115979 2022-11-23T02:55:39.6200947Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 115980 2022-11-23T02:55:39.6201153Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 115981 2022-11-23T02:55:39.6201352Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 115982 2022-11-23T02:55:39.6201711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6201874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6202239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6202412Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6202914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6203072Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6203420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6203593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6203926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6204084Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6204435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6204604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6204930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6205084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6205479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6205656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6205893Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplvepx68c 2022-11-23T02:55:39.6206142Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplvepx68c/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6206372Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2mk1agrr 2022-11-23T02:55:39.6206616Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2mk1agrr/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6206840Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzhlnh23r 2022-11-23T02:55:39.6207084Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzhlnh23r/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6207364Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpie3dq88f 2022-11-23T02:55:39.6207608Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpie3dq88f/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6207818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6208024Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.6208229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6208433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6208565Z fi_getinfo: -61 2022-11-23T02:55:39.6208678Z fi_getinfo: -61 2022-11-23T02:55:39.6208796Z fi_getinfo: -61 2022-11-23T02:55:39.6208914Z fi_getinfo: -61 2022-11-23T02:55:39.6209167Z ok (20.596s) 2022-11-23T02:55:39.6209187Z 2022-11-23T02:55:39.6209446Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6209551Z Ran 1 test in 20.596s 2022-11-23T02:55:39.6209571Z 2022-11-23T02:55:39.6209651Z OK 2022-11-23T02:55:39.6209670Z 2022-11-23T02:55:39.6209777Z Generating XML reports... 2022-11-23T02:55:39.6210310Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025418.xml 2022-11-23T02:55:39.6210670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6210831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6211195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6211372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6211617Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpey5ldw_k 2022-11-23T02:55:39.6211885Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpey5ldw_k/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6211905Z 2022-11-23T02:55:39.6212001Z Running tests... 
2022-11-23T02:55:39.6212406Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6212742Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6213042Z test_rref_with_unpickleable_attributes (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6213243Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 116492 2022-11-23T02:55:39.6213445Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 116493 2022-11-23T02:55:39.6213641Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 116494 2022-11-23T02:55:39.6213837Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 116495 2022-11-23T02:55:39.6214267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6214425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6214782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6214952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6215293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6215448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6215798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6215967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6216372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6216530Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6216872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6217041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6217380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6217533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6217877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6218110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6218399Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptcmmyd4e 2022-11-23T02:55:39.6218701Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptcmmyd4e/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6218929Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpck_xzu85 2022-11-23T02:55:39.6219333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpck_xzu85/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6219616Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe6iv5w3m 2022-11-23T02:55:39.6219906Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe6iv5w3m/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6220132Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6f2he2w2 2022-11-23T02:55:39.6220430Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6f2he2w2/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6220689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6220947Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6221200Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6221380Z fi_getinfo: -61 2022-11-23T02:55:39.6221583Z fi_getinfo: -61 2022-11-23T02:55:39.6221834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6221953Z fi_getinfo: -61 2022-11-23T02:55:39.6222126Z fi_getinfo: -61 2022-11-23T02:55:39.6222290Z ok (8.053s) 2022-11-23T02:55:39.6222309Z 2022-11-23T02:55:39.6222601Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6222742Z Ran 1 test in 8.054s 2022-11-23T02:55:39.6222761Z 2022-11-23T02:55:39.6222881Z OK 2022-11-23T02:55:39.6222900Z 2022-11-23T02:55:39.6223054Z Generating XML reports... 2022-11-23T02:55:39.6223718Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025441.xml 2022-11-23T02:55:39.6224285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6224501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6224906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6225124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6225401Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6lqmkrsg 2022-11-23T02:55:39.6225696Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6lqmkrsg/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6225715Z 2022-11-23T02:55:39.6225852Z Running tests... 
2022-11-23T02:55:39.6226156Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6226568Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6226954Z test_tensor_view_as_return_value (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6227240Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 117003 2022-11-23T02:55:39.6227657Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 117004 2022-11-23T02:55:39.6227910Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 117005 2022-11-23T02:55:39.6228158Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 117006 2022-11-23T02:55:39.6228572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6228793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6229221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6229395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6229832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6230044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6230452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6230681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6231080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6231277Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6245043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6245253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6245640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6245803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6246159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6246334Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6246574Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm8eg9u1q 2022-11-23T02:55:39.6247012Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm8eg9u1q/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6247253Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe19u61rf 2022-11-23T02:55:39.6247630Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe19u61rf/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6247884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmo06ec9l 2022-11-23T02:55:39.6248120Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2juk6t5y 2022-11-23T02:55:39.6248370Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmo06ec9l/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6248621Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2juk6t5y/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6248835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6249051Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 1 2022-11-23T02:55:39.6249257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6249528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6249829Z fi_getinfo: -61 2022-11-23T02:55:39.6249948Z fi_getinfo: -61 2022-11-23T02:55:39.6250065Z fi_getinfo: -61 2022-11-23T02:55:39.6250182Z fi_getinfo: -61 2022-11-23T02:55:39.6250265Z ok (10.144s) 2022-11-23T02:55:39.6250287Z 2022-11-23T02:55:39.6250714Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6250813Z Ran 1 test in 10.144s 2022-11-23T02:55:39.6250832Z 2022-11-23T02:55:39.6250912Z OK 2022-11-23T02:55:39.6250931Z 2022-11-23T02:55:39.6251044Z Generating XML reports... 2022-11-23T02:55:39.6251581Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025452.xml 2022-11-23T02:55:39.6251946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6252114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6252486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6252666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6252903Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpblj33zhe 2022-11-23T02:55:39.6253161Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpblj33zhe/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6253182Z 2022-11-23T02:55:39.6253279Z Running tests... 
2022-11-23T02:55:39.6253690Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6254025Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6254326Z test_device_maps_backward_pass (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6254534Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 118130 2022-11-23T02:55:39.6254737Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 118131 2022-11-23T02:55:39.6254934Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 118132 2022-11-23T02:55:39.6255120Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 118133 2022-11-23T02:55:39.6255467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6255625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6255978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6256152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6256488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6256695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6257053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6257219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6257736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6257895Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6258251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6258426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6258777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6258987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6259349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6259525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6259766Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsto8tdpw 2022-11-23T02:55:39.6260033Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsto8tdpw/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6260275Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbjh4swl4 2022-11-23T02:55:39.6260681Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbjh4swl4/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6260910Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1xm0odp9 2022-11-23T02:55:39.6261154Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1xm0odp9/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6261388Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoui9vzaq 2022-11-23T02:55:39.6261631Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoui9vzaq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6261834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6262043Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:55:39.6262248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6262628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:55:39.6262764Z fi_getinfo: -61 2022-11-23T02:55:39.6262888Z fi_getinfo: -61 2022-11-23T02:55:39.6263011Z fi_getinfo: -61 2022-11-23T02:55:39.6263133Z fi_getinfo: -61 2022-11-23T02:55:39.6263218Z ok (8.131s) 2022-11-23T02:55:39.6263238Z 2022-11-23T02:55:39.6263499Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6263600Z Ran 1 test in 8.131s 2022-11-23T02:55:39.6263619Z 2022-11-23T02:55:39.6263698Z OK 2022-11-23T02:55:39.6263717Z 2022-11-23T02:55:39.6263830Z Generating XML reports... 2022-11-23T02:55:39.6264699Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20221123025505.xml 2022-11-23T02:55:39.6265060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6265223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6265580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6265762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6266082Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2u44s4yg 2022-11-23T02:55:39.6266349Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2u44s4yg/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6266369Z 2022-11-23T02:55:39.6266465Z Running tests... 
2022-11-23T02:55:39.6266720Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6267063Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6267532Z test_dist_autograd_sync_streams (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6267736Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 118789 2022-11-23T02:55:39.6267929Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 118790 2022-11-23T02:55:39.6268353Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 118791 2022-11-23T02:55:39.6268626Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 118792 2022-11-23T02:55:39.6268991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6269154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6269519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6269700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6270050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6270203Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6270561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6270743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6271097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6271259Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6271781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6271954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6272288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6272444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6272782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6272951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6273196Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfdc97gjo 2022-11-23T02:55:39.6273447Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfdc97gjo/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6273680Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplz__kuhq 2022-11-23T02:55:39.6273924Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplz__kuhq/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6274159Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppx596w1j 2022-11-23T02:55:39.6274402Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppx596w1j/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6274798Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc4nalnrz 2022-11-23T02:55:39.6275051Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc4nalnrz/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6275324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6275546Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.6275759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6275970Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6276105Z fi_getinfo: -61 2022-11-23T02:55:39.6276229Z fi_getinfo: -61 2022-11-23T02:55:39.6276346Z fi_getinfo: -61 2022-11-23T02:55:39.6276468Z fi_getinfo: -61 2022-11-23T02:55:39.6276556Z ok (9.148s) 2022-11-23T02:55:39.6276575Z 2022-11-23T02:55:39.6276827Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6276928Z Ran 1 test in 9.148s 2022-11-23T02:55:39.6276948Z 2022-11-23T02:55:39.6277028Z OK 2022-11-23T02:55:39.6277047Z 2022-11-23T02:55:39.6277158Z Generating XML reports... 2022-11-23T02:55:39.6277960Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20221123025516.xml 2022-11-23T02:55:39.6278303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6278462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6278813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6278985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6279217Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6nn09il8 2022-11-23T02:55:39.6279462Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6nn09il8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6279482Z 2022-11-23T02:55:39.6279574Z Running tests... 
2022-11-23T02:55:39.6279819Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6280159Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-11-23T02:55:39.6280459Z test_gradients_synchronizations (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:55:39.6280661Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 119448 2022-11-23T02:55:39.6280859Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 119449 2022-11-23T02:55:39.6281057Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 119450 2022-11-23T02:55:39.6281249Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 119451 2022-11-23T02:55:39.6281594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6281753Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6282111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6282279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6282618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6282773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6283120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6283291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6283627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6283782Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6284175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6284353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6284688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:55:39.6284841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:55:39.6285183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:55:39.6285351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:55:39.6285585Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyl3__68x 2022-11-23T02:55:39.6285830Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyl3__68x/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6286107Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_b4hzz9w 2022-11-23T02:55:39.6286356Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_b4hzz9w/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6286767Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppxz5agd8 2022-11-23T02:55:39.6287022Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppxz5agd8/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6287256Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptp98kumb 2022-11-23T02:55:39.6287508Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptp98kumb/_remote_module_non_scriptable.py 2022-11-23T02:55:39.6287726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:55:39.6287939Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 2 2022-11-23T02:55:39.6288153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:55:39.6288371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:55:39.6288507Z fi_getinfo: -61 2022-11-23T02:55:39.6288624Z fi_getinfo: -61 2022-11-23T02:55:39.6288746Z fi_getinfo: -61 2022-11-23T02:55:39.6288867Z fi_getinfo: -61 2022-11-23T02:55:39.6288954Z ok (10.326s) 2022-11-23T02:55:39.6288974Z 2022-11-23T02:55:39.6289226Z ---------------------------------------------------------------------- 2022-11-23T02:55:39.6289487Z Ran 1 test in 10.326s 2022-11-23T02:55:39.6289506Z 2022-11-23T02:55:39.6289582Z OK 2022-11-23T02:55:39.6289601Z 2022-11-23T02:55:39.6289705Z Generating XML reports... 2022-11-23T02:55:39.6290421Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20221123025528.xml 2022-11-23T02:55:39.6290440Z 2022-11-23T02:55:39.6290806Z ##[endgroup] 2022-11-23T02:55:39.6291309Z FINISHED PRINTING LOG FILE of distributed/rpc/cuda/test_tensorpipe_agent (/var/lib/jenkins/workspace/test/test-reports/distributed-rpc-cuda-test_tensorpipe_agent_f7xxdyjg) 2022-11-23T02:55:39.6291333Z 2022-11-23T02:55:39.8412423Z 2022-11-23T02:55:39.8412860Z real 22m43.443s 2022-11-23T02:55:39.8413155Z user 53m29.726s 2022-11-23T02:55:39.8413484Z sys 42m38.651s 2022-11-23T02:55:39.8413825Z + for f in test/distributed/fsdp/*.py 2022-11-23T02:55:39.8414393Z + python test/run_test.py --verbose -i distributed/fsdp/test_checkpoint_wrapper.py 2022-11-23T02:55:42.2002030Z Ignoring disabled issues: [] 2022-11-23T02:55:42.2523311Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T02:55:42.2523914Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:55:42.2524259Z Selected tests: 2022-11-23T02:55:42.2524574Z distributed/fsdp/test_checkpoint_wrapper.py 2022-11-23T02:55:42.2547196Z Prioritized test from test file changes. 2022-11-23T02:55:42.2547516Z reordering tests for PR: 2022-11-23T02:55:42.2548025Z prioritized: [] 2022-11-23T02:55:42.2548570Z the rest: ['distributed/fsdp/test_checkpoint_wrapper.py'] 2022-11-23T02:55:42.2548809Z 2022-11-23T02:55:42.2549346Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:55:42.2550262Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:55:42.2556855Z parallel (file granularity) tests: 2022-11-23T02:55:42.2557505Z 2022-11-23T02:55:42.2557939Z serial (file granularity) tests: 2022-11-23T02:55:42.2558375Z distributed/fsdp/test_checkpoint_wrapper.py 2022-11-23T02:55:44.5250521Z Ignoring disabled issues: [] 2022-11-23T02:55:44.9491202Z Running distributed/fsdp/test_checkpoint_wrapper.py ... [2022-11-23 02:55:44.948405] 2022-11-23T02:55:44.9492373Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_checkpoint_wrapper.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:55:44.948845] 2022-11-23T02:55:49.6436108Z 2022-11-23T02:55:49.6436984Z Expand the folded group to see the log file of distributed/fsdp/test_checkpoint_wrapper 2022-11-23T02:55:49.6438890Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_checkpoint_wrapper (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_checkpoint_wrapper_a2i2bsfo) 2022-11-23T02:55:49.6439579Z 2022-11-23T02:55:49.6439804Z Running tests... 
2022-11-23T02:55:49.6440758Z ---------------------------------------------------------------------- 2022-11-23T02:55:49.6441359Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper 2022-11-23T02:55:49.6441850Z test_apply_activation_checkpointing (__main__.CheckpointWrapperTest) 2022-11-23T02:55:49.6442282Z Ensures that `apply_activation_checkpointing` can be used ... ok (1.840s) 2022-11-23T02:55:49.6442732Z test_checkpoint_wrapper_cpu_offload (__main__.CheckpointWrapperTest) ... ok (0.434s) 2022-11-23T02:55:49.6443364Z test_checkpoint_wrapper_kwarg_support (__main__.CheckpointWrapperTest) ... ok (0.010s) 2022-11-23T02:55:49.6443964Z test_checkpoint_wrapper_parity (__main__.CheckpointWrapperTest) 2022-11-23T02:55:49.6445083Z Tests that using checkpoint_wrapper or the functional ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79510 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.002s) 2022-11-23T02:55:49.6445880Z test_forward_missing_attributes (__main__.CheckpointWrapperTest) ... ok (0.001s) 2022-11-23T02:55:49.6446288Z test_fqn (__main__.CheckpointWrapperTest) ... ok (0.001s) 2022-11-23T02:55:49.6446720Z test_load_activation_checkpointed_module (__main__.CheckpointWrapperTest) ... ok (0.003s) 2022-11-23T02:55:49.6446992Z 2022-11-23T02:55:49.6447248Z ---------------------------------------------------------------------- 2022-11-23T02:55:49.6447585Z Ran 7 tests in 2.292s 2022-11-23T02:55:49.6447750Z 2022-11-23T02:55:49.6447876Z OK (skipped=1) 2022-11-23T02:55:49.6448030Z 2022-11-23T02:55:49.6448154Z Generating XML reports... 
2022-11-23T02:55:49.6448772Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper/TEST-CheckpointWrapperTest-20221123025546.xml 2022-11-23T02:55:49.6449152Z 2022-11-23T02:55:49.6451615Z ##[endgroup] 2022-11-23T02:55:49.6452296Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_checkpoint_wrapper (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_checkpoint_wrapper_a2i2bsfo) 2022-11-23T02:55:49.6452678Z 2022-11-23T02:55:50.0023428Z 2022-11-23T02:55:50.0023861Z real 0m10.161s 2022-11-23T02:55:50.0024810Z user 0m15.961s 2022-11-23T02:55:50.0025362Z sys 0m11.868s 2022-11-23T02:55:50.0025879Z + for f in test/distributed/fsdp/*.py 2022-11-23T02:55:50.0026578Z + python test/run_test.py --verbose -i distributed/fsdp/test_distributed_checkpoint.py 2022-11-23T02:55:52.4004747Z Ignoring disabled issues: [] 2022-11-23T02:55:52.4534188Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:55:52.4534651Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:55:52.4535112Z Selected tests: 2022-11-23T02:55:52.4535399Z distributed/fsdp/test_distributed_checkpoint.py 2022-11-23T02:55:52.4559093Z Prioritized test from test file changes. 
2022-11-23T02:55:52.4559432Z reordering tests for PR: 2022-11-23T02:55:52.4559714Z prioritized: [] 2022-11-23T02:55:52.4560380Z the rest: ['distributed/fsdp/test_distributed_checkpoint.py'] 2022-11-23T02:55:52.4560602Z 2022-11-23T02:55:52.4561080Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:55:52.4562399Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:55:52.4566013Z parallel (file granularity) tests: 2022-11-23T02:55:52.4566376Z 2022-11-23T02:55:52.4566628Z serial (file granularity) tests: 2022-11-23T02:55:52.4566979Z distributed/fsdp/test_distributed_checkpoint.py 2022-11-23T02:55:54.6817633Z Ignoring disabled issues: [] 2022-11-23T02:55:55.0998725Z Running distributed/fsdp/test_distributed_checkpoint.py ... [2022-11-23 02:55:55.099125] 2022-11-23T02:55:55.0999966Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_distributed_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:55:55.099541] 2022-11-23T02:56:07.5559198Z 2022-11-23T02:56:07.5560640Z Expand the folded group to see the log file of distributed/fsdp/test_distributed_checkpoint 2022-11-23T02:56:07.5562760Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_distributed_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_distributed_checkpoint_6mqvqv4m) 2022-11-23T02:56:07.5563394Z 2022-11-23T02:56:07.5563564Z Running tests... 
2022-11-23T02:56:07.5564167Z ---------------------------------------------------------------------- 2022-11-23T02:56:07.5564749Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_distributed_checkpoint 2022-11-23T02:56:07.5565271Z test_distributed_checkpoint_state_dict_type_StateDictType_LOCAL_STATE_DICT (__main__.TestDistributedCheckpoint) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:56:07.5566005Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 120498 2022-11-23T02:56:07.5566441Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 120499 2022-11-23T02:56:07.5567126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:07.5567498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:07.5568155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:07.5568519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:07.5569062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:07.5569543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:07.5570052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:07.5570718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:07.5571111Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:56:07.5571966Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:56:07.5572673Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for 
key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:56:07.5573350Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:56:07.5573883Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:56:07.5574364Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:56:07.5574702Z dist init r=0, world=2 2022-11-23T02:56:07.5574964Z dist init r=1, world=2 2022-11-23T02:56:07.5575272Z ok (5.913s) 2022-11-23T02:56:07.5575743Z test_distributed_checkpoint_state_dict_type_StateDictType_SHARDED_STATE_DICT (__main__.TestDistributedCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 120645 2022-11-23T02:56:07.5576513Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 120646 2022-11-23T02:56:07.5577238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:07.5577762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:07.5578236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:07.5578807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:07.5579275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:07.5579714Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:07.5580257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:07.5580719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:07.5581232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 
to store for rank: 0 2022-11-23T02:56:07.5581654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:56:07.5582356Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:56:07.5583154Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:56:07.5583714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:56:07.5584489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:56:07.5584917Z dist init r=1, world=2 2022-11-23T02:56:07.5585116Z dist init r=0, world=2 2022-11-23T02:56:07.5585530Z ok (4.114s) 2022-11-23T02:56:07.5585661Z 2022-11-23T02:56:07.5586036Z ---------------------------------------------------------------------- 2022-11-23T02:56:07.5586464Z Ran 2 tests in 10.028s 2022-11-23T02:56:07.5586723Z 2022-11-23T02:56:07.5586826Z OK 2022-11-23T02:56:07.5587064Z 2022-11-23T02:56:07.5587150Z Generating XML reports... 
2022-11-23T02:56:07.5587747Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_distributed_checkpoint/TEST-TestDistributedCheckpoint-20221123025557.xml 2022-11-23T02:56:07.5588241Z 2022-11-23T02:56:07.5588482Z ##[endgroup] 2022-11-23T02:56:07.5589118Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_distributed_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_distributed_checkpoint_6mqvqv4m) 2022-11-23T02:56:07.5589843Z 2022-11-23T02:56:07.9347774Z 2022-11-23T02:56:07.9348589Z real 0m17.932s 2022-11-23T02:56:07.9348772Z user 0m32.407s 2022-11-23T02:56:07.9349107Z sys 0m26.579s 2022-11-23T02:56:07.9349514Z + for f in test/distributed/fsdp/*.py 2022-11-23T02:56:07.9350061Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_apply.py 2022-11-23T02:56:10.3016297Z Ignoring disabled issues: [] 2022-11-23T02:56:10.3539072Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:56:10.3539588Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:56:10.3539967Z Selected tests: 2022-11-23T02:56:10.3540272Z distributed/fsdp/test_fsdp_apply.py 2022-11-23T02:56:10.3564310Z Prioritized test from test file changes. 
2022-11-23T02:56:10.3564668Z reordering tests for PR: 2022-11-23T02:56:10.3564960Z prioritized: [] 2022-11-23T02:56:10.3565475Z the rest: ['distributed/fsdp/test_fsdp_apply.py'] 2022-11-23T02:56:10.3565794Z 2022-11-23T02:56:10.3566273Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:56:10.3567604Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:56:10.3572623Z parallel (file granularity) tests: 2022-11-23T02:56:10.3572908Z 2022-11-23T02:56:10.3573174Z serial (file granularity) tests: 2022-11-23T02:56:10.3573528Z distributed/fsdp/test_fsdp_apply.py 2022-11-23T02:56:12.6307093Z Ignoring disabled issues: [] 2022-11-23T02:56:13.0475371Z Running distributed/fsdp/test_fsdp_apply.py ... [2022-11-23 02:56:13.046987] 2022-11-23T02:56:13.0477996Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_apply.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:56:13.047432] 2022-11-23T02:56:29.9575583Z 2022-11-23T02:56:29.9576437Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_apply 2022-11-23T02:56:29.9577598Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_apply (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_apply_z24pa8yd) 2022-11-23T02:56:29.9577911Z 2022-11-23T02:56:29.9578035Z Running tests... 2022-11-23T02:56:29.9578623Z ---------------------------------------------------------------------- 2022-11-23T02:56:29.9579205Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_apply 2022-11-23T02:56:29.9579632Z test_apply_in_summon_raises_error (__main__.TestApply) 2022-11-23T02:56:29.9580095Z Tests that calling ``apply()`` on an FSDP instance inside the ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:56:29.9580682Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 121004 2022-11-23T02:56:29.9581048Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 121005 2022-11-23T02:56:29.9581702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:29.9582236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:29.9582883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:29.9583262Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:29.9584268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:29.9584759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:29.9585879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:29.9586360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:29.9586855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:56:29.9587598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:56:29.9588282Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:56:29.9589050Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:56:29.9589622Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:56:29.9590021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:56:29.9591318Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:56:29.9592248Z warnings.warn( 2022-11-23T02:56:29.9593393Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:56:29.9594254Z warnings.warn( 2022-11-23T02:56:29.9594552Z File "<string>", line 1, in <module> 2022-11-23T02:56:29.9594937Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:56:29.9595221Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:56:29.9595612Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:56:29.9596012Z return self._bootstrap(parent_sentinel) 2022-11-23T02:56:29.9596402Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:56:29.9596757Z self.run() 2022-11-23T02:56:29.9597116Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:56:29.9597499Z self._target(*self._args, **self._kwargs) 2022-11-23T02:56:29.9598004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:56:29.9598511Z self.run_test(test_name, pipe) 2022-11-23T02:56:29.9599032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:56:29.9599442Z getattr(self, test_name)() 2022-11-23T02:56:29.9599975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:56:29.9600334Z fn() 2022-11-23T02:56:29.9600832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:56:29.9601238Z return func(*args, **kwargs) 2022-11-23T02:56:29.9601648Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_apply.py", line 98, in test_apply_in_summon_raises_error 2022-11-23T02:56:29.9602104Z transformer.apply(self._init_linear_weights) 2022-11-23T02:56:29.9602691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 492, in apply 2022-11-23T02:56:29.9603137Z self._assert_state(TrainingState.IDLE)
2022-11-23T02:56:29.9603699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 1084, in _assert_state 2022-11-23T02:56:29.9604124Z traceback.print_stack() 2022-11-23T02:56:29.9604412Z dist init r=1, world=2 2022-11-23T02:56:29.9604660Z dist init r=0, world=2 2022-11-23T02:56:29.9604998Z Asserting FSDP instance is: FullyShardedDataParallel( 2022-11-23T02:56:29.9605534Z (_fsdp_wrapped_module): TransformerWithSharedParams( 2022-11-23T02:56:29.9605876Z (embed_tokens): Embedding(23, 16) 2022-11-23T02:56:29.9606182Z (transformer): Transformer( 2022-11-23T02:56:29.9606491Z (encoder): TransformerEncoder( 2022-11-23T02:56:29.9606789Z (layers): ModuleList( 2022-11-23T02:56:29.9607071Z (0): FullyShardedDataParallel( 2022-11-23T02:56:29.9607441Z (_fsdp_wrapped_module): TransformerEncoderLayer( 2022-11-23T02:56:29.9607801Z (self_attn): MultiheadAttention( 2022-11-23T02:56:29.9608208Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-11-23T02:56:29.9608577Z ) 2022-11-23T02:56:29.9608897Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2022-11-23T02:56:29.9609239Z (dropout): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9609685Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2022-11-23T02:56:29.9610174Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9610649Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9610989Z (dropout1): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9611337Z (dropout2): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9611623Z ) 2022-11-23T02:56:29.9611833Z ) 2022-11-23T02:56:29.9612124Z (1): FullyShardedDataParallel( 2022-11-23T02:56:29.9612556Z (_fsdp_wrapped_module): TransformerEncoderLayer( 2022-11-23T02:56:29.9612838Z (self_attn): MultiheadAttention( 2022-11-23T02:56:29.9613268Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, 
out_features=16, bias=True) 2022-11-23T02:56:29.9613637Z ) 2022-11-23T02:56:29.9613934Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2022-11-23T02:56:29.9614304Z (dropout): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9614673Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2022-11-23T02:56:29.9615139Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9615587Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9616043Z (dropout1): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9616315Z (dropout2): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9616578Z ) 2022-11-23T02:56:29.9616818Z ) 2022-11-23T02:56:29.9617045Z ) 2022-11-23T02:56:29.9617407Z (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9617716Z ) 2022-11-23T02:56:29.9617988Z (decoder): TransformerDecoder( 2022-11-23T02:56:29.9618266Z (layers): ModuleList( 2022-11-23T02:56:29.9618581Z (0): FullyShardedDataParallel( 2022-11-23T02:56:29.9618954Z (_fsdp_wrapped_module): TransformerDecoderLayer( 2022-11-23T02:56:29.9619290Z (self_attn): MultiheadAttention( 2022-11-23T02:56:29.9619725Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-11-23T02:56:29.9620091Z ) 2022-11-23T02:56:29.9620382Z (multihead_attn): MultiheadAttention( 2022-11-23T02:56:29.9620791Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-11-23T02:56:29.9621154Z ) 2022-11-23T02:56:29.9621472Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2022-11-23T02:56:29.9621817Z (dropout): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9622183Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2022-11-23T02:56:29.9622657Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9623167Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9623648Z (norm3): 
LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9624452Z (dropout1): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9624723Z (dropout2): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9625038Z (dropout3): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9625317Z ) 2022-11-23T02:56:29.9625546Z ) 2022-11-23T02:56:29.9625807Z (1): FullyShardedDataParallel( 2022-11-23T02:56:29.9626181Z (_fsdp_wrapped_module): TransformerDecoderLayer( 2022-11-23T02:56:29.9626543Z (self_attn): MultiheadAttention( 2022-11-23T02:56:29.9626955Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-11-23T02:56:29.9627316Z ) 2022-11-23T02:56:29.9627719Z (multihead_attn): MultiheadAttention( 2022-11-23T02:56:29.9628137Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-11-23T02:56:29.9628504Z ) 2022-11-23T02:56:29.9628819Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2022-11-23T02:56:29.9629184Z (dropout): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9629527Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2022-11-23T02:56:29.9630012Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9630478Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9630904Z (norm3): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9631268Z (dropout1): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9631615Z (dropout2): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9632005Z (dropout3): Dropout(p=0.1, inplace=False) 2022-11-23T02:56:29.9632226Z ) 2022-11-23T02:56:29.9632463Z ) 2022-11-23T02:56:29.9632668Z ) 2022-11-23T02:56:29.9633049Z (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-11-23T02:56:29.9633345Z ) 2022-11-23T02:56:29.9633543Z ) 2022-11-23T02:56:29.9633890Z (output_proj): Linear(in_features=16, out_features=23, bias=True) 2022-11-23T02:56:29.9634467Z 
(bn): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 2022-11-23T02:56:29.9634734Z ) 2022-11-23T02:56:29.9634928Z ) 2022-11-23T02:56:29.9635318Z ERROR: expected to be in states [] but current state is TrainingState.SUMMON_FULL_PARAMS 2022-11-23T02:56:29.9635702Z ok (5.809s) 2022-11-23T02:56:29.9635971Z test_nested_module_apply (__main__.TestApply) 2022-11-23T02:56:29.9636610Z Tests that ``apply()`` modifies parameter values in-place on a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 121147 2022-11-23T02:56:29.9637177Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 121148 2022-11-23T02:56:29.9637777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:29.9638232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:29.9638811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:29.9639295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:29.9639864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:29.9640331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:29.9640918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:29.9641486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:29.9641942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:56:29.9642457Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:56:29.9643128Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based 
barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:56:29.9643807Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:56:29.9644347Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:56:29.9644828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:56:29.9646101Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:56:29.9646974Z warnings.warn( 2022-11-23T02:56:29.9648112Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:56:29.9648898Z warnings.warn( 2022-11-23T02:56:29.9649170Z dist init r=1, world=2 2022-11-23T02:56:29.9649434Z dist init r=0, world=2 2022-11-23T02:56:29.9649667Z ok (4.214s) 2022-11-23T02:56:29.9649978Z test_transformer_module_apply (__main__.TestApply) 2022-11-23T02:56:29.9650626Z Tests that ``apply()`` modifies parameter values in-place on an ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 121290 2022-11-23T02:56:29.9651156Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 121291 2022-11-23T02:56:29.9651779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:29.9652244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:29.9652839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:29.9653299Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:29.9653895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:56:29.9654472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:56:29.9655034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:56:29.9655515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:56:29.9655984Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:56:29.9656483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:56:29.9657130Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:56:29.9657885Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:56:29.9658427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:56:29.9658973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:56:29.9660235Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:56:29.9661028Z warnings.warn( 2022-11-23T02:56:29.9662193Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:56:29.9663137Z warnings.warn( 2022-11-23T02:56:29.9663409Z dist init r=1, world=2 2022-11-23T02:56:29.9663649Z dist init r=0, world=2 2022-11-23T02:56:29.9664312Z ok (4.515s) 2022-11-23T02:56:29.9664461Z 2022-11-23T02:56:29.9664819Z ---------------------------------------------------------------------- 2022-11-23T02:56:29.9665159Z Ran 3 tests in 14.538s 2022-11-23T02:56:29.9665317Z 2022-11-23T02:56:29.9665429Z OK 2022-11-23T02:56:29.9665585Z 2022-11-23T02:56:29.9665719Z Generating XML reports... 
2022-11-23T02:56:29.9666246Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_apply/TEST-TestApply-20221123025614.xml 2022-11-23T02:56:29.9666521Z 2022-11-23T02:56:29.9666859Z ##[endgroup] 2022-11-23T02:56:29.9667473Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_apply (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_apply_z24pa8yd) 2022-11-23T02:56:29.9667843Z 2022-11-23T02:56:30.3187914Z 2022-11-23T02:56:30.3188456Z real 0m22.384s 2022-11-23T02:56:30.3188811Z user 0m41.593s 2022-11-23T02:56:30.3189074Z sys 0m35.237s 2022-11-23T02:56:30.3189359Z + for f in test/distributed/fsdp/*.py 2022-11-23T02:56:30.3190017Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_checkpoint.py 2022-11-23T02:56:32.6644861Z Ignoring disabled issues: [] 2022-11-23T02:56:32.7172256Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:56:32.7172740Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:56:32.7173120Z Selected tests: 2022-11-23T02:56:32.7173460Z distributed/fsdp/test_fsdp_checkpoint.py 2022-11-23T02:56:32.7197528Z Prioritized test from test file changes. 
2022-11-23T02:56:32.7197964Z reordering tests for PR: 2022-11-23T02:56:32.7198254Z prioritized: [] 2022-11-23T02:56:32.7198838Z the rest: ['distributed/fsdp/test_fsdp_checkpoint.py'] 2022-11-23T02:56:32.7199060Z 2022-11-23T02:56:32.7199572Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:56:32.7200528Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:56:32.7204646Z parallel (file granularity) tests: 2022-11-23T02:56:32.7204942Z 2022-11-23T02:56:32.7205245Z serial (file granularity) tests: 2022-11-23T02:56:32.7205813Z distributed/fsdp/test_fsdp_checkpoint.py 2022-11-23T02:56:35.0170582Z Ignoring disabled issues: [] 2022-11-23T02:56:35.4157324Z Running distributed/fsdp/test_fsdp_checkpoint.py ... [2022-11-23 02:56:35.415000] 2022-11-23T02:56:35.4158661Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:56:35.415478] 2022-11-23T02:57:59.1778217Z 2022-11-23T02:57:59.1778844Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_checkpoint 2022-11-23T02:57:59.1779870Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_checkpoint_z7a2mtl6) 2022-11-23T02:57:59.1780451Z 2022-11-23T02:57:59.1780672Z Running tests... 
2022-11-23T02:57:59.1784284Z ---------------------------------------------------------------------- 2022-11-23T02:57:59.1784940Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_checkpoint 2022-11-23T02:57:59.1785608Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=False)_offload_activations_False_use_orig_params_False (__main__.TestFSDPCheckpoint) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:57:59.1786578Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 121645 2022-11-23T02:57:59.1787043Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 121646 2022-11-23T02:57:59.1787509Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 121647 2022-11-23T02:57:59.1787943Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 121648 2022-11-23T02:57:59.1788580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1789022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1789600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1790074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1790673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1791118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1791692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1792164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1792730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:57:59.1793163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1798872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1799454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1800087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1800565Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1801153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1801609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1802070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1802705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1803412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1804412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1805588Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1807084Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1808321Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1809522Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1810453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1811290Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1812063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1813362Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1813728Z dist init r=2, world=4 2022-11-23T02:57:59.1814076Z dist init r=3, world=4 2022-11-23T02:57:59.1814306Z dist init r=1, world=4 2022-11-23T02:57:59.1814556Z dist init r=0, world=4 2022-11-23T02:57:59.1814796Z ok (6.776s) 2022-11-23T02:57:59.1815349Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=False)_offload_activations_False_use_orig_params_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 121946 2022-11-23T02:57:59.1816087Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 121947 2022-11-23T02:57:59.1816536Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 121948 2022-11-23T02:57:59.1816980Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 121949 2022-11-23T02:57:59.1817589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1818048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1818631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1819105Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1819665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1820115Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1820692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1821142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1821722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1822166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1822736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1823189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1823767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1824638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1825229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1825674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1826131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1826635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1827105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1827706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1828383Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1829076Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1829742Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1830424Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1830944Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1831414Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1831957Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1832416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1832773Z dist init r=0, world=4 2022-11-23T02:57:59.1833009Z dist init r=1, world=4 2022-11-23T02:57:59.1833258Z dist init r=3, world=4 2022-11-23T02:57:59.1833506Z dist init r=2, world=4 2022-11-23T02:57:59.1833721Z ok (5.021s) 2022-11-23T02:57:59.1834287Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=False)_offload_activations_True_use_orig_params_False (__main__.TestFSDPCheckpoint) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 122247 2022-11-23T02:57:59.1834944Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 122248 2022-11-23T02:57:59.1835396Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 122249 2022-11-23T02:57:59.1835830Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 122250 2022-11-23T02:57:59.1836457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1836914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1837529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1837987Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1838569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1839015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1839767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1840235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1840819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1841261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1841816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1842276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1842849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:57:59.1843277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1843841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1844306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1844819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1845304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1845789Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1846276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1846930Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1847596Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1848331Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1849019Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1849607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1850061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1850532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1851007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1851342Z dist init r=1, world=4 2022-11-23T02:57:59.1851596Z dist init r=2, world=4 2022-11-23T02:57:59.1851842Z dist init r=3, world=4 2022-11-23T02:57:59.1852071Z dist init r=0, world=4 2022-11-23T02:57:59.1852304Z ok (5.021s) 2022-11-23T02:57:59.1852869Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=False)_offload_activations_True_use_orig_params_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 122548 2022-11-23T02:57:59.1853527Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 122549 2022-11-23T02:57:59.1853955Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 122550 2022-11-23T02:57:59.1854405Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 122551 2022-11-23T02:57:59.1855019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1855452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1856027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1856493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1857072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1857500Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1858073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1858534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1859114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1859535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1860103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1860563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1861118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1861560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1862182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1862656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1863088Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1863587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1864366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1864849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1865518Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1866208Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1866993Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1867802Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1868297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1868757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1869203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1869635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1869973Z dist init r=3, world=4 2022-11-23T02:57:59.1870214Z dist init r=0, world=4 2022-11-23T02:57:59.1870441Z dist init r=1, world=4 2022-11-23T02:57:59.1870676Z dist init r=2, world=4 2022-11-23T02:57:59.1870905Z ok (5.020s) 2022-11-23T02:57:59.1871488Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=True)_offload_activations_False_use_orig_params_False (__main__.TestFSDPCheckpoint) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 122849 2022-11-23T02:57:59.1872128Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 122850 2022-11-23T02:57:59.1872742Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 122851 2022-11-23T02:57:59.1873187Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 122852 2022-11-23T02:57:59.1873781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1874235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1874808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1875285Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1875988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1876418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1876967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1877393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1878141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1878585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1879219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1879744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1880333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:57:59.1880923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1881657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1882096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1882545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1883039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1883511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1884223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1885040Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1885736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1886400Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1887082Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1887598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1888069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1888514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1888981Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1889336Z dist init r=1, world=4 2022-11-23T02:57:59.1889573Z dist init r=2, world=4 2022-11-23T02:57:59.1889822Z dist init r=0, world=4 2022-11-23T02:57:59.1890067Z dist init r=3, world=4 2022-11-23T02:57:59.1890283Z ok (5.021s) 2022-11-23T02:57:59.1890846Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=True)_offload_activations_False_use_orig_params_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 123150 2022-11-23T02:57:59.1891497Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 123151 2022-11-23T02:57:59.1892099Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 123152 2022-11-23T02:57:59.1892511Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 123153 2022-11-23T02:57:59.1893110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1893548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1894105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1894540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1895284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1895727Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1896276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1896739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1897385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1897996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1898533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1898980Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1899536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1899940Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1900684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1901142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1901593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1902127Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1902618Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1903254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1904283Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1904974Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1905654Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1906330Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1906853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1907303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1908096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1908562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1908899Z dist init r=3, world=4 2022-11-23T02:57:59.1909142Z dist init r=1, world=4 2022-11-23T02:57:59.1909391Z dist init r=2, world=4 2022-11-23T02:57:59.1909630Z dist init r=0, world=4 2022-11-23T02:57:59.1909847Z ok (5.020s) 2022-11-23T02:57:59.1910404Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=True)_offload_activations_True_use_orig_params_False (__main__.TestFSDPCheckpoint) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 123451 2022-11-23T02:57:59.1911050Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 123452 2022-11-23T02:57:59.1911823Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 123453 2022-11-23T02:57:59.1912245Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 123454 2022-11-23T02:57:59.1912849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1913299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1913852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1914319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1914893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1915331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1915990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1916623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1917178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1917599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1918129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1918569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1919120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:57:59.1919525Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1920148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1920591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1921020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1921480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1921947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1922408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1923033Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1923671Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1924514Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1925188Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1925699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1926146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1926600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1927064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1927557Z dist init r=0, world=4 2022-11-23T02:57:59.1927797Z dist init r=3, world=4 2022-11-23T02:57:59.1928208Z dist init r=1, world=4 2022-11-23T02:57:59.1928434Z dist init r=2, world=4 2022-11-23T02:57:59.1928672Z ok (4.921s) 2022-11-23T02:57:59.1929234Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=True)_offload_activations_True_use_orig_params_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 123752 2022-11-23T02:57:59.1929882Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 123753 2022-11-23T02:57:59.1930314Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 123754 2022-11-23T02:57:59.1930758Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 123755 2022-11-23T02:57:59.1931366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1931795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1932364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1932988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1933596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1934013Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1934562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1935006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1935539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1935963Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1936511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1937008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1937601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1938030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1938575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1939012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1939428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1939903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1940370Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1940818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1941456Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1942114Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1942941Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1943596Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1944383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1944851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1945310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1945758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1946104Z dist init r=3, world=4 2022-11-23T02:57:59.1946354Z dist init r=1, world=4 2022-11-23T02:57:59.1946581Z dist init r=2, world=4 2022-11-23T02:57:59.1946822Z dist init r=0, world=4 2022-11-23T02:57:59.1947053Z ok (4.920s) 2022-11-23T02:57:59.1947761Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=False)_offload_activations_False_use_orig_params_False (__main__.TestFSDPCheckpoint) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 124053 2022-11-23T02:57:59.1948389Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 124054 2022-11-23T02:57:59.1949006Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 124055 2022-11-23T02:57:59.1949444Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 124056 2022-11-23T02:57:59.1950035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1950565Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1951148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1951768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1952304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1952725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1953265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1953692Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1954242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1954742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1955290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1955717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1956452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:57:59.1956890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1957440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1957897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1958345Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1958836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1959304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1959784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1960428Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1961106Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1961768Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1962605Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1963281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1963753Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1964195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1964646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1964996Z dist init r=3, world=4 2022-11-23T02:57:59.1965226Z dist init r=1, world=4 2022-11-23T02:57:59.1965470Z dist init r=2, world=4 2022-11-23T02:57:59.1965713Z dist init r=0, world=4 2022-11-23T02:57:59.1966087Z ok (4.920s) 2022-11-23T02:57:59.1966632Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=False)_offload_activations_False_use_orig_params_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 124354 2022-11-23T02:57:59.1967255Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 124355 2022-11-23T02:57:59.1967747Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 124356 2022-11-23T02:57:59.1968166Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 124357 2022-11-23T02:57:59.1968757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1969184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1969717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1970165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1970717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1971143Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1971727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1972235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1972787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1973388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1973937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1974391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1974959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1975374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1976091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1976540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1976970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1977425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1978065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1978552Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1979198Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.1979863Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1980537Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1981367Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1982047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.1982494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.1982947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.1983406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.1983739Z dist init r=2, world=4 2022-11-23T02:57:59.1984288Z dist init r=3, world=4 2022-11-23T02:57:59.1984538Z dist init r=1, world=4 2022-11-23T02:57:59.1984762Z dist init r=0, world=4 2022-11-23T02:57:59.1984992Z ok (4.920s) 2022-11-23T02:57:59.1985639Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=False)_offload_activations_True_use_orig_params_False (__main__.TestFSDPCheckpoint) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 124655 2022-11-23T02:57:59.1986301Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 124656 2022-11-23T02:57:59.1986726Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 124657 2022-11-23T02:57:59.1987164Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 124658 2022-11-23T02:57:59.1987780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1988206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1988773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1989235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1989890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1990316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1990880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1991340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1991890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.1992484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1993027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1993468Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1994007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:57:59.1994435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.1994976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.1995416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.1995826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.1996298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.1996769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.1997221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.1997850Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1998508Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1999158Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.1999793Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.2000287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.2000932Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.2001394Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.2001841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.2002193Z dist init r=1, world=4 2022-11-23T02:57:59.2002499Z dist init r=2, world=4 2022-11-23T02:57:59.2002735Z dist init r=0, world=4 2022-11-23T02:57:59.2002979Z dist init r=3, world=4 2022-11-23T02:57:59.2003210Z ok (4.920s) 2022-11-23T02:57:59.2003899Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=False)_offload_activations_True_use_orig_params_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 124956 2022-11-23T02:57:59.2004908Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 124957 2022-11-23T02:57:59.2005357Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 124958 2022-11-23T02:57:59.2005797Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 124959 2022-11-23T02:57:59.2006399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2006918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2007498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2007968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2008528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2008973Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2009540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2009984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2010560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2010998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2011569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2012014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2012586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2013025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2013593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2014037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2014485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.2014983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.2015460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.2015943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.2016612Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.2017296Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2017954Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2018629Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2019142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.2019620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.2020134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.2020629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.2020984Z dist init r=1, world=4 2022-11-23T02:57:59.2021230Z dist init r=0, world=4 2022-11-23T02:57:59.2021458Z dist init r=3, world=4 2022-11-23T02:57:59.2021698Z dist init r=2, world=4 2022-11-23T02:57:59.2021928Z ok (4.920s) 2022-11-23T02:57:59.2022474Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=True)_offload_activations_False_use_orig_params_False (__main__.TestFSDPCheckpoint) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 125257 2022-11-23T02:57:59.2023120Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 125258 2022-11-23T02:57:59.2023564Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 125259 2022-11-23T02:57:59.2024275Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 125260 2022-11-23T02:57:59.2024875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2025323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2025894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2026344Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2026914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2027350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2027916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2028365Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2028936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2029375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2029946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2030407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2030977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:57:59.2031420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2031971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2032455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2032918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.2033407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.2033898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.2034367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.2035027Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2035717Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2036378Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2037155Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.2037743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.2038216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.2038658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.2039108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.2039463Z dist init r=1, world=4 2022-11-23T02:57:59.2039700Z dist init r=0, world=4 2022-11-23T02:57:59.2039947Z dist init r=2, world=4 2022-11-23T02:57:59.2040194Z dist init r=3, world=4 2022-11-23T02:57:59.2040423Z ok (5.020s) 2022-11-23T02:57:59.2040988Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=True)_offload_activations_False_use_orig_params_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 125558 2022-11-23T02:57:59.2041800Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 125559 2022-11-23T02:57:59.2042254Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 125560 2022-11-23T02:57:59.2042697Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 125561 2022-11-23T02:57:59.2043294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2043743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2044311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2044758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2045329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2045774Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2046339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2046777Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2047345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2047789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2048351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2048810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2049410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2049865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2050448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2050894Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2051340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.2051833Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.2052337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.2052805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.2053462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.2054208Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2054897Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2055551Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2056069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.2056539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.2057005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.2057451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.2057800Z dist init r=2, world=4 2022-11-23T02:57:59.2058109Z dist init r=3, world=4 2022-11-23T02:57:59.2058337Z dist init r=1, world=4 2022-11-23T02:57:59.2058589Z dist init r=0, world=4 2022-11-23T02:57:59.2058824Z ok (5.020s) 2022-11-23T02:57:59.2059368Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=True)_offload_activations_True_use_orig_params_False (__main__.TestFSDPCheckpoint) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 125859 2022-11-23T02:57:59.2060019Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 125860 2022-11-23T02:57:59.2060625Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 125861 2022-11-23T02:57:59.2061065Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 125862 2022-11-23T02:57:59.2061635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2062071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2062641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2063098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2063824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2064491Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2065068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2065517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2066090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2066530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2067107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2067714Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2068261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:57:59.2068686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2069211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2069660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2070089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.2070564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.2071013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.2071607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.2072243Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2072901Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2073727Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2074404Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.2074922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.2075388Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.2075913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.2076528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.2076872Z dist init r=2, world=4 2022-11-23T02:57:59.2077101Z dist init r=0, world=4 2022-11-23T02:57:59.2077345Z dist init r=3, world=4 2022-11-23T02:57:59.2077584Z dist init r=1, world=4 2022-11-23T02:57:59.2077792Z ok (4.920s) 2022-11-23T02:57:59.2078520Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=True)_offload_activations_True_use_orig_params_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 126160 2022-11-23T02:57:59.2079174Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 126161 2022-11-23T02:57:59.2079629Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 126162 2022-11-23T02:57:59.2080060Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 126163 2022-11-23T02:57:59.2080689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2081313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2082038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2082517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2083106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2083557Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2084105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2084576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2085161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2085603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2086153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2086614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2087186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:57:59.2087606Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:57:59.2088175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:57:59.2088645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:57:59.2089169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:57:59.2089660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:57:59.2090157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:57:59.2090658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:57:59.2091300Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:57:59.2091990Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2092841Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2093501Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:57:59.2094042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:57:59.2094511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:57:59.2094968Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:57:59.2095424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:57:59.2095752Z dist init r=3, world=4 2022-11-23T02:57:59.2096181Z dist init r=1, world=4 2022-11-23T02:57:59.2096442Z dist init r=2, world=4 2022-11-23T02:57:59.2096669Z dist init r=0, world=4 2022-11-23T02:57:59.2096916Z ok (4.920s) 2022-11-23T02:57:59.2097073Z 2022-11-23T02:57:59.2097356Z ---------------------------------------------------------------------- 2022-11-23T02:57:59.2097673Z Ran 16 tests in 81.282s 2022-11-23T02:57:59.2097840Z 2022-11-23T02:57:59.2097932Z OK 2022-11-23T02:57:59.2098067Z 2022-11-23T02:57:59.2098199Z Generating XML reports... 
2022-11-23T02:57:59.2098829Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_checkpoint/TEST-TestFSDPCheckpoint-20221123025637.xml 2022-11-23T02:57:59.2099320Z 2022-11-23T02:57:59.2099698Z ##[endgroup] 2022-11-23T02:57:59.2100303Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_checkpoint_z7a2mtl6) 2022-11-23T02:57:59.2100658Z 2022-11-23T02:57:59.5137200Z 2022-11-23T02:57:59.5137471Z real 1m29.195s 2022-11-23T02:57:59.5137763Z user 4m44.061s 2022-11-23T02:57:59.5137986Z sys 3m5.556s 2022-11-23T02:57:59.5138280Z + for f in test/distributed/fsdp/*.py 2022-11-23T02:57:59.5138829Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_clip_grad_norm.py 2022-11-23T02:58:01.8848774Z Ignoring disabled issues: [] 2022-11-23T02:58:01.9384330Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:58:01.9385335Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:58:01.9385703Z Selected tests: 2022-11-23T02:58:01.9386008Z distributed/fsdp/test_fsdp_clip_grad_norm.py 2022-11-23T02:58:01.9413042Z Prioritized test from test file changes. 
2022-11-23T02:58:01.9413380Z reordering tests for PR: 2022-11-23T02:58:01.9413655Z prioritized: [] 2022-11-23T02:58:01.9414143Z the rest: ['distributed/fsdp/test_fsdp_clip_grad_norm.py'] 2022-11-23T02:58:01.9414363Z 2022-11-23T02:58:01.9414897Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:58:01.9415832Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:58:01.9422050Z parallel (file granularity) tests: 2022-11-23T02:58:01.9422602Z 2022-11-23T02:58:01.9422876Z serial (file granularity) tests: 2022-11-23T02:58:01.9423207Z distributed/fsdp/test_fsdp_clip_grad_norm.py 2022-11-23T02:58:04.2662594Z Ignoring disabled issues: [] 2022-11-23T02:58:04.6923260Z Running distributed/fsdp/test_fsdp_clip_grad_norm.py ... [2022-11-23 02:58:04.691755] 2022-11-23T02:58:04.6924648Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_clip_grad_norm.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:58:04.692257] 2022-11-23T02:58:34.9908070Z 2022-11-23T02:58:34.9909154Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_clip_grad_norm 2022-11-23T02:58:34.9910579Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_clip_grad_norm (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_clip_grad_norm_0dbk0t94) 2022-11-23T02:58:34.9911675Z 2022-11-23T02:58:34.9911830Z Running tests... 
2022-11-23T02:58:34.9912366Z ---------------------------------------------------------------------- 2022-11-23T02:58:34.9912953Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm 2022-11-23T02:58:34.9913388Z test_ddp_parity (__main__.TestClipGradNorm) 2022-11-23T02:58:34.9913835Z Tests FSDP with ``FullyShardedDataParallel.clip_grad_norm_()` against ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:58:34.9916536Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 126673 2022-11-23T02:58:34.9917015Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 126674 2022-11-23T02:58:34.9917466Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 126675 2022-11-23T02:58:34.9917923Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 126676 2022-11-23T02:58:34.9918611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:34.9919071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:34.9919661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:34.9920138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:34.9920707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:34.9921162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:34.9921741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:34.9922294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:34.9922883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T02:58:34.9923337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:34.9923917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:34.9924382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:34.9924942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:34.9925396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:34.9925971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:34.9926416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:34.9926874Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:34.9927372Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:34.9928043Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:58:34.9928538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:58:34.9929196Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:58:34.9929879Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:58:34.9930564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:58:34.9931223Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:58:34.9931754Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:58:34.9932311Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:34.9932780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:58:34.9933221Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:34.9933695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9934188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9934674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9935130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9935611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9936084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9936547Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9937029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9937499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9937966Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9938424Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9938889Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:34.9939898Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9941212Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9942437Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9943661Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9945275Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9946523Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9947343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9947813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9948377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9948861Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9949857Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9951084Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9952293Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9953525Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:34.9954747Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9955470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9955958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9973022Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9973620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9974152Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9974671Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9975193Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9975685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9976202Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9976708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9977317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9977855Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9978367Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9978875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:34.9979364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9979872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9980383Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9980870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9981379Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9981968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9982475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9982971Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9983486Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9984420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9984916Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9985405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9985893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9986382Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9986858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9987337Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9987899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:34.9988384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9989412Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:34.9990173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9990661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9991146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9991610Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9992089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9992564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9993019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9993498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9993973Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9994445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9995013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9995508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9995985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:34.9996445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9996926Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9997403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9997880Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9998334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9998805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9999361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:34.9999830Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0000292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0000769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0001246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0002237Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0002986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0003608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0004092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:35.0004554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0005028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0005498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0005970Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0006430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0006902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0007373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0007831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0008309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0008782Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0009252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0009706Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0010173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0011304Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0012574Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0013784Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0015167Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0016464Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0017368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0017846Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0018313Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0018758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0019223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0019688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:35.0020148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0020591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0021545Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0022947Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0024429Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0025189Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0025658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0026140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0026620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0027079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0027564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:35.0028148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0028639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0029097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0029574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0030056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0030665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0031127Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0031581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0032035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0032560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0033192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0033667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0034147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0034603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0035072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0035541Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0035991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:35.0036461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0037471Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0038359Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0038811Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0039279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0039743Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0040202Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0040645Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0041106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0041562Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0042001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0042643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0043183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0043664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0044121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:35.0044596Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0045133Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0045766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0046711Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0047910Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0049100Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0050562Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0051805Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:35.0052698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0053158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0053627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0054085Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0054725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0055188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0055664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0056138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:35.0056480Z dist init r=3, world=4 2022-11-23T02:58:35.0056733Z dist init r=0, world=4 2022-11-23T02:58:35.0056981Z dist init r=1, world=4 2022-11-23T02:58:35.0057212Z dist init r=2, world=4 2022-11-23T02:58:35.0057610Z ok (23.065s) 2022-11-23T02:58:35.0057886Z test_non_root (__main__.TestClipGradNorm) 2022-11-23T02:58:35.0058486Z Tests that calling ``clip_grad_norm_()`` on a non-root FSDP instance ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 126974 2022-11-23T02:58:35.0059189Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 126975 2022-11-23T02:58:35.0059651Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 126976 2022-11-23T02:58:35.0060102Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 126977 2022-11-23T02:58:35.0060699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:35.0061159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:35.0061742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:35.0062378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:35.0063163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:35.0063625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:35.0064417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:35.0064878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:35.0065460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:35.0065912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:35.0066488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:35.0066937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:35.0067614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:58:35.0068065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:35.0068618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:35.0069087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:35.0069696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:58:35.0070185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:58:35.0070649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:35.0071128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:35.0071770Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:58:35.0072439Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:58:35.0073088Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:58:35.0073952Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:58:35.0074475Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:58:35.0074982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:35.0075437Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:58:35.0075905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:35.0076424Z dist init r=2, world=4 2022-11-23T02:58:35.0076653Z dist init r=0, world=4 2022-11-23T02:58:35.0076893Z dist init r=3, world=4 2022-11-23T02:58:35.0077134Z dist init r=1, world=4 2022-11-23T02:58:35.0077345Z ok (4.818s) 2022-11-23T02:58:35.0077491Z 2022-11-23T02:58:35.0077759Z ---------------------------------------------------------------------- 2022-11-23T02:58:35.0078086Z Ran 2 tests in 27.884s 2022-11-23T02:58:35.0078245Z 2022-11-23T02:58:35.0078317Z OK 2022-11-23T02:58:35.0078452Z 2022-11-23T02:58:35.0078576Z Generating XML reports... 
2022-11-23T02:58:35.0079158Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm/TEST-TestClipGradNorm-20221123025806.xml 2022-11-23T02:58:35.0079689Z 2022-11-23T02:58:35.0080095Z ##[endgroup] 2022-11-23T02:58:35.0080701Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_clip_grad_norm (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_clip_grad_norm_0dbk0t94) 2022-11-23T02:58:35.0081170Z 2022-11-23T02:58:35.3257382Z 2022-11-23T02:58:35.3257975Z real 0m35.812s 2022-11-23T02:58:35.3258316Z user 1m52.889s 2022-11-23T02:58:35.3258593Z sys 0m36.869s 2022-11-23T02:58:35.3259095Z + for f in test/distributed/fsdp/*.py 2022-11-23T02:58:35.3259654Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_comm.py 2022-11-23T02:58:37.7185044Z Ignoring disabled issues: [] 2022-11-23T02:58:37.7705932Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:58:37.7706550Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:58:37.7706917Z Selected tests: 2022-11-23T02:58:37.7707183Z distributed/fsdp/test_fsdp_comm.py 2022-11-23T02:58:37.7731194Z Prioritized test from test file changes. 
2022-11-23T02:58:37.7731762Z reordering tests for PR: 2022-11-23T02:58:37.7732447Z prioritized: [] 2022-11-23T02:58:37.7732950Z the rest: ['distributed/fsdp/test_fsdp_comm.py'] 2022-11-23T02:58:37.7733163Z 2022-11-23T02:58:37.7733701Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:58:37.7734653Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:58:37.7739968Z parallel (file granularity) tests: 2022-11-23T02:58:37.7740276Z 2022-11-23T02:58:37.7740528Z serial (file granularity) tests: 2022-11-23T02:58:37.7740837Z distributed/fsdp/test_fsdp_comm.py 2022-11-23T02:58:40.0872653Z Ignoring disabled issues: [] 2022-11-23T02:58:40.4917417Z Running distributed/fsdp/test_fsdp_comm.py ... [2022-11-23 02:58:40.491055] 2022-11-23T02:58:40.4918212Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:58:40.491506] 2022-11-23T02:59:25.4435771Z 2022-11-23T02:59:25.4436572Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_comm 2022-11-23T02:59:25.4439503Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm__vuoscpo) 2022-11-23T02:59:25.4439902Z 2022-11-23T02:59:25.4440087Z Running tests... 
2022-11-23T02:59:25.4440624Z ---------------------------------------------------------------------- 2022-11-23T02:59:25.4441175Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm 2022-11-23T02:59:25.4442435Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication) 2022-11-23T02:59:25.4443629Z Tests FSDP's communication cost in terms of calls to collective ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:59:25.4444287Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 127487 2022-11-23T02:59:25.4444742Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 127488 2022-11-23T02:59:25.4445257Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 127489 2022-11-23T02:59:25.4445600Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 127490 2022-11-23T02:59:25.4446311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4446767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4447482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4447888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4448546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4449459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4450117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4450665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4451276Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4451760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4452544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4453090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4453624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4454321Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4454851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4455296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4455765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:59:25.4456283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:59:25.4456767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:59:25.4457271Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:59:25.4457948Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4458642Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4459469Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:59:25.4460170Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4460700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:59:25.4461158Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:59:25.4461652Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:59:25.4462110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:59:25.4462499Z dist init r=0, world=4 2022-11-23T02:59:25.4462657Z dist init r=1, world=4 2022-11-23T02:59:25.4462930Z dist init r=2, world=4 2022-11-23T02:59:25.4463204Z dist init r=3, world=4 2022-11-23T02:59:25.4463455Z ok (6.968s) 2022-11-23T02:59:25.4464457Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-11-23T02:59:25.4465254Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 127788 2022-11-23T02:59:25.4465837Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 127789 2022-11-23T02:59:25.4466278Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 127790 2022-11-23T02:59:25.4466746Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 127791 2022-11-23T02:59:25.4467341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4467827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4468505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4469016Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4469600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4470029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4470631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4471090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4471720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4472161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4472821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4473255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4473848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:59:25.4474288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4474877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4475368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4475840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:59:25.4476323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:59:25.4476829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:59:25.4477341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:59:25.4477988Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4478693Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4479395Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4480089Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:59:25.4480599Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:59:25.4481209Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:59:25.4481708Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:59:25.4482189Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:59:25.4482535Z dist init r=3, world=4 2022-11-23T02:59:25.4482802Z dist init r=2, world=4 2022-11-23T02:59:25.4483064Z dist init r=0, world=4 2022-11-23T02:59:25.4483300Z dist init r=1, world=4 2022-11-23T02:59:25.4483618Z ok (5.221s) 2022-11-23T02:59:25.4483949Z test_communication_nested_model_False_use_no_sync_True_sharding_strategy_None (__main__.TestCommunication) 2022-11-23T02:59:25.4484651Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 128089 2022-11-23T02:59:25.4485205Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 128090 2022-11-23T02:59:25.4485668Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 128091 2022-11-23T02:59:25.4486256Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 128092 2022-11-23T02:59:25.4486867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4487339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4487921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4488388Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4488981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T02:59:25.4489441Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4490024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4490545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4491136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4491598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4492160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4492640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4493228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4493689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4494244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4494735Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4495203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:59:25.4495719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:59:25.4496204Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:59:25.4496749Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:59:25.4497416Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:59:25.4498096Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4498831Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4499574Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4500120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:59:25.4500584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:59:25.4501060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:59:25.4501546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:59:25.4501922Z dist init r=0, world=4 2022-11-23T02:59:25.4502207Z dist init r=3, world=4 2022-11-23T02:59:25.4502473Z dist init r=1, world=4 2022-11-23T02:59:25.4502802Z dist init r=2, world=4 2022-11-23T02:59:25.4503028Z ok (5.322s) 2022-11-23T02:59:25.4503463Z test_communication_nested_model_False_use_no_sync_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-11-23T02:59:25.4504950Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 128390 2022-11-23T02:59:25.4505477Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 128391 2022-11-23T02:59:25.4505940Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 128392 2022-11-23T02:59:25.4506416Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 128393 2022-11-23T02:59:25.4507002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4507398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4507987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4508477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4509132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4509595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4510180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4510665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4511297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4511855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4512348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4512831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4513403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:59:25.4513868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4514447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4514906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4515373Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:59:25.4515987Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:59:25.4516497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:59:25.4516977Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:59:25.4517646Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4518361Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4519066Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4519733Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:59:25.4520272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:59:25.4520760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:59:25.4521216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:59:25.4521704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:59:25.4522074Z dist init r=2, world=4 2022-11-23T02:59:25.4522401Z dist init r=1, world=4 2022-11-23T02:59:25.4522647Z dist init r=3, world=4 2022-11-23T02:59:25.4522909Z dist init r=0, world=4 2022-11-23T02:59:25.4523241Z ok (5.322s) 2022-11-23T02:59:25.4523531Z test_communication_nested_model_True_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication) 2022-11-23T02:59:25.4524255Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 128691 2022-11-23T02:59:25.4524805Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 128692 2022-11-23T02:59:25.4525246Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 128693 2022-11-23T02:59:25.4525715Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 128694 2022-11-23T02:59:25.4526335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4526864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4527432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4527917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4528508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T02:59:25.4528971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4529528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4530013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4530593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4531025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4531601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4532082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4532673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4533189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4533688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4534157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4534596Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:59:25.4535145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:59:25.4535626Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:59:25.4536133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:59:25.4536777Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:59:25.4537479Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4538178Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4538882Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4539362Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:59:25.4539904Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:59:25.4540385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:59:25.4540860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:59:25.4542116Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4542951Z warnings.warn( 2022-11-23T02:59:25.4544504Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:59:25.4545389Z warnings.warn( 2022-11-23T02:59:25.4546561Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4547409Z warnings.warn( 2022-11-23T02:59:25.4548531Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4549327Z warnings.warn( 2022-11-23T02:59:25.4549581Z dist init r=0, world=4 2022-11-23T02:59:25.4549837Z dist init r=3, world=4 2022-11-23T02:59:25.4550065Z dist init r=2, world=4 2022-11-23T02:59:25.4550314Z dist init r=1, world=4 2022-11-23T02:59:25.4550551Z ok (4.920s) 2022-11-23T02:59:25.4550957Z test_communication_nested_model_True_use_no_sync_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-11-23T02:59:25.4551700Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 128992 2022-11-23T02:59:25.4552259Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 128993 2022-11-23T02:59:25.4552733Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 128994 2022-11-23T02:59:25.4553167Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 128995 2022-11-23T02:59:25.4553791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4554259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4554824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4555313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4555947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4556372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4557003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4557505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4558094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4558555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4559111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4559640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4560261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:59:25.4560694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4561347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4561831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4562302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:59:25.4562788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:59:25.4563337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:59:25.4563844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:59:25.4564510Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4565189Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4565900Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4566604Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:59:25.4567137Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:59:25.4567599Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:59:25.4568073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:59:25.4568560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:59:25.4569816Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4570629Z warnings.warn( 2022-11-23T02:59:25.4571794Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4572593Z warnings.warn( 2022-11-23T02:59:25.4573935Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4574751Z warnings.warn( 2022-11-23T02:59:25.4575874Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4576661Z warnings.warn( 2022-11-23T02:59:25.4576925Z dist init r=2, world=4 2022-11-23T02:59:25.4577248Z dist init r=1, world=4 2022-11-23T02:59:25.4577482Z dist init r=3, world=4 2022-11-23T02:59:25.4577748Z dist init r=0, world=4 2022-11-23T02:59:25.4577996Z ok (4.820s) 2022-11-23T02:59:25.4578366Z test_communication_nested_model_True_use_no_sync_True_sharding_strategy_None (__main__.TestCommunication) 2022-11-23T02:59:25.4579088Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 129293 2022-11-23T02:59:25.4579646Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 129294 2022-11-23T02:59:25.4580111Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 129295 2022-11-23T02:59:25.4580545Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 129296 2022-11-23T02:59:25.4581171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4581641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4582211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4582700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4583300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4583770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4584746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4585214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4585810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4586204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4586790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4587275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4587865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:59:25.4588307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4588948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4589482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4589899Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:59:25.4590388Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:59:25.4590898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:59:25.4591481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:59:25.4592252Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4592845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4593575Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4594247Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:59:25.4594758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:59:25.4595246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:59:25.4595801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:59:25.4596281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:59:25.4597530Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4598335Z warnings.warn( 2022-11-23T02:59:25.4599492Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4600295Z warnings.warn( 2022-11-23T02:59:25.4601445Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4602374Z warnings.warn( 2022-11-23T02:59:25.4603446Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4604250Z warnings.warn( 2022-11-23T02:59:25.4604514Z dist init r=1, world=4 2022-11-23T02:59:25.4604850Z dist init r=3, world=4 2022-11-23T02:59:25.4605016Z dist init r=2, world=4 2022-11-23T02:59:25.4605282Z dist init r=0, world=4 2022-11-23T02:59:25.4605530Z ok (5.021s) 2022-11-23T02:59:25.4605937Z test_communication_nested_model_True_use_no_sync_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-11-23T02:59:25.4606782Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 129594 2022-11-23T02:59:25.4607277Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 129595 2022-11-23T02:59:25.4607980Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 129596 2022-11-23T02:59:25.4608343Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 129597 2022-11-23T02:59:25.4608978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4609447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4610014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4610509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4611120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4611670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4612301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4612783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4613371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:59:25.4613830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4614388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4614867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4615454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:59:25.4615884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:59:25.4616471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:59:25.4616951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:59:25.4617417Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:59:25.4617905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:59:25.4618410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:59:25.4618915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:59:25.4619553Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4620261Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4620966Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:59:25.4621667Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:59:25.4622174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:59:25.4622661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:59:25.4623139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:59:25.4623619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:59:25.4625167Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4625994Z warnings.warn( 2022-11-23T02:59:25.4627164Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4627957Z warnings.warn( 2022-11-23T02:59:25.4629106Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4629973Z warnings.warn( 2022-11-23T02:59:25.4631099Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:59:25.4631897Z warnings.warn( 2022-11-23T02:59:25.4632158Z dist init r=1, world=4 2022-11-23T02:59:25.4632424Z dist init r=2, world=4 2022-11-23T02:59:25.4632668Z dist init r=0, world=4 2022-11-23T02:59:25.4632929Z dist init r=3, world=4 2022-11-23T02:59:25.4633182Z ok (4.921s) 2022-11-23T02:59:25.4633340Z 2022-11-23T02:59:25.4633596Z ---------------------------------------------------------------------- 2022-11-23T02:59:25.4633938Z Ran 8 tests in 42.517s 2022-11-23T02:59:25.4634108Z 2022-11-23T02:59:25.4634208Z OK 2022-11-23T02:59:25.4634348Z 2022-11-23T02:59:25.4634528Z Generating XML reports... 
2022-11-23T02:59:25.4635109Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_comm/TEST-TestCommunication-20221123025842.xml 2022-11-23T02:59:25.4635478Z 2022-11-23T02:59:25.4635973Z ##[endgroup] 2022-11-23T02:59:25.4636541Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm__vuoscpo) 2022-11-23T02:59:25.4636901Z 2022-11-23T02:59:25.8367511Z 2022-11-23T02:59:25.8368391Z real 0m50.511s 2022-11-23T02:59:25.8368742Z user 2m31.768s 2022-11-23T02:59:25.8368992Z sys 1m39.429s 2022-11-23T02:59:25.8369218Z + for f in test/distributed/fsdp/*.py 2022-11-23T02:59:25.8369906Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_comm_hooks.py 2022-11-23T02:59:28.2331301Z Ignoring disabled issues: [] 2022-11-23T02:59:28.2858707Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T02:59:28.2859395Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T02:59:28.2859837Z Selected tests: 2022-11-23T02:59:28.2860119Z distributed/fsdp/test_fsdp_comm_hooks.py 2022-11-23T02:59:28.2889684Z Prioritized test from test file changes. 
2022-11-23T02:59:28.2890052Z reordering tests for PR: 2022-11-23T02:59:28.2890763Z prioritized: [] 2022-11-23T02:59:28.2891366Z the rest: ['distributed/fsdp/test_fsdp_comm_hooks.py'] 2022-11-23T02:59:28.2891521Z 2022-11-23T02:59:28.2892043Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T02:59:28.2893309Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T02:59:28.2898947Z parallel (file granularity) tests: 2022-11-23T02:59:28.2899615Z 2022-11-23T02:59:28.2899906Z serial (file granularity) tests: 2022-11-23T02:59:28.2900260Z distributed/fsdp/test_fsdp_comm_hooks.py 2022-11-23T02:59:30.6201208Z Ignoring disabled issues: [] 2022-11-23T02:59:31.0572357Z Running distributed/fsdp/test_fsdp_comm_hooks.py ... [2022-11-23 02:59:31.056634] 2022-11-23T02:59:31.0573212Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm_hooks.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:59:31.057069] 2022-11-23T03:01:44.4899205Z 2022-11-23T03:01:44.4900135Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_comm_hooks 2022-11-23T03:01:44.4901693Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm_hooks (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm_hooks_gtmjrjfq) 2022-11-23T03:01:44.4906824Z 2022-11-23T03:01:44.4907669Z Running tests... 2022-11-23T03:01:44.4908388Z ---------------------------------------------------------------------- 2022-11-23T03:01:44.4909143Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks 2022-11-23T03:01:44.4910296Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:01:44.4911250Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 130107 2022-11-23T03:01:44.4911740Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 130108 2022-11-23T03:01:44.4912199Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 130109 2022-11-23T03:01:44.4912664Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 130110 2022-11-23T03:01:44.4913291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4913761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4914347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4914824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4915400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4915857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4916449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4916927Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4917502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4917958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4918523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4918950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4919539Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4920027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4920626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4921319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4922000Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.4922512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.4923020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.4923520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.4924166Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4924874Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4925560Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4926356Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.4926861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.4927348Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.4927827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.4928298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.4928704Z dist init r=2, world=4 2022-11-23T03:01:44.4928960Z dist init r=0, world=4 2022-11-23T03:01:44.4929208Z dist init r=1, world=4 2022-11-23T03:01:44.4929432Z dist init r=3, world=4 2022-11-23T03:01:44.4929673Z ok (6.643s) 2022-11-23T03:01:44.4930186Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 130408 2022-11-23T03:01:44.4930808Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 130409 2022-11-23T03:01:44.4931244Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 130410 2022-11-23T03:01:44.4931702Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 130411 2022-11-23T03:01:44.4932323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4932768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4933361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4933839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4934427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4934863Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-11-23T03:01:44.4935444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4935914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4936495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4936928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4937495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4937964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4938520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4938969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4939599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4940073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4940509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.4941002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.4941491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.4941979Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.4942621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.4943316Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4944369Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4945052Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4945570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.4946045Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.4946513Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.4946963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.4947312Z dist init r=2, world=4 2022-11-23T03:01:44.4947562Z dist init r=0, world=4 2022-11-23T03:01:44.4947800Z dist init r=3, world=4 2022-11-23T03:01:44.4948045Z dist init r=1, world=4 2022-11-23T03:01:44.4948284Z ok (4.919s) 2022-11-23T03:01:44.4948782Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 130709 2022-11-23T03:01:44.4949401Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 130710 2022-11-23T03:01:44.4949847Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 130711 2022-11-23T03:01:44.4950287Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 130712 2022-11-23T03:01:44.4950883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4951335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4951907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4952390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4952948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4953397Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4953967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4954416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4954991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4955437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4956002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4956452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4957110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:01:44.4957566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4958140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4958593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4959047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.4959542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.4960013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.4960497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.4961223Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4961916Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4962585Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4963265Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.4963785Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.4964258Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.4964708Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.4965176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.4965532Z dist init r=1, world=4 2022-11-23T03:01:44.4965768Z dist init r=2, world=4 2022-11-23T03:01:44.4966016Z dist init r=3, world=4 2022-11-23T03:01:44.4966258Z dist init r=0, world=4 2022-11-23T03:01:44.4966475Z ok (4.819s) 2022-11-23T03:01:44.4966980Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 131010 2022-11-23T03:01:44.4967587Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 131011 2022-11-23T03:01:44.4968035Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 131012 2022-11-23T03:01:44.4968460Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 131013 2022-11-23T03:01:44.4969073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4969533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4970089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4970566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4971143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4971590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-11-23T03:01:44.4972142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4972608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4973179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4973677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4974237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4974702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4975275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4975701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4976267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4976729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4977203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.4977755Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.4978241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.4978730Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.4979382Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.4980057Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4980738Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4981423Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4981918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.4982395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.4982854Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.4983322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.4983662Z dist init r=0, world=4 2022-11-23T03:01:44.4984190Z dist init r=3, world=4 2022-11-23T03:01:44.4984450Z dist init r=1, world=4 2022-11-23T03:01:44.4984678Z dist init r=2, world=4 2022-11-23T03:01:44.4984911Z ok (4.818s) 2022-11-23T03:01:44.4985414Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 539 2022-11-23T03:01:44.4986007Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 540 2022-11-23T03:01:44.4986437Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 541 2022-11-23T03:01:44.4986867Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 542 2022-11-23T03:01:44.4987484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4987924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4988499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4988974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4989551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4989982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4990548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4991099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4991669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.4992112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4992683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4993145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4993701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:01:44.4994149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.4994717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.4995254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.4995689Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.4996188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.4996677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.4997145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.4997796Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4998484Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4999170Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.4999839Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5000360Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5000831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5001298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5001753Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5002104Z dist init r=3, world=4 2022-11-23T03:01:44.5002354Z dist init r=1, world=4 2022-11-23T03:01:44.5002581Z dist init r=0, world=4 2022-11-23T03:01:44.5002826Z dist init r=2, world=4 2022-11-23T03:01:44.5003060Z ok (4.919s) 2022-11-23T03:01:44.5003556Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 840 2022-11-23T03:01:44.5004168Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 841 2022-11-23T03:01:44.5004607Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 842 2022-11-23T03:01:44.5005044Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 843 2022-11-23T03:01:44.5005634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5006083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5006655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5007111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5007688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5008190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:01:44.5008769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5009218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5009790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5010233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5010799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5011245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5011817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5012315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5012869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5013335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5013783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5014278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5014748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5015232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5015881Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5016577Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5017250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5017929Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5018449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5018904Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5019374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5019834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5020189Z dist init r=2, world=4 2022-11-23T03:01:44.5020428Z dist init r=0, world=4 2022-11-23T03:01:44.5020674Z dist init r=3, world=4 2022-11-23T03:01:44.5021022Z dist init r=1, world=4 2022-11-23T03:01:44.5021242Z ok (4.919s) 2022-11-23T03:01:44.5021656Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5022406Z Tests FSDP's default communication hook's behavior and correctness. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1141 2022-11-23T03:01:44.5022947Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1142 2022-11-23T03:01:44.5023376Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1143 2022-11-23T03:01:44.5023810Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1144 2022-11-23T03:01:44.5024678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5025119Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5025785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5026269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5026851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5027283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5027850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5028315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5028929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5029376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5030082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5030546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5031103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5031552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5032123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5032567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5033017Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5033518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5034011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5034484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5035133Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5035818Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5036500Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5037157Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5037678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5038148Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5038626Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5039078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5039429Z dist init r=3, world=4 2022-11-23T03:01:44.5039678Z dist init r=0, world=4 2022-11-23T03:01:44.5039905Z dist init r=1, world=4 2022-11-23T03:01:44.5040148Z dist init r=2, world=4 2022-11-23T03:01:44.5040379Z ok (4.920s) 2022-11-23T03:01:44.5040770Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5041511Z Tests FSDP's default communication hook's behavior and correctness. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1442 2022-11-23T03:01:44.5042053Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1443 2022-11-23T03:01:44.5042500Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1444 2022-11-23T03:01:44.5043027Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1445 2022-11-23T03:01:44.5043635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5044085Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5044641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5045111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5045697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:01:44.5046123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5046687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5047205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5047763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5048200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5048764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5049227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5049779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5050219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5050781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5051231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5051682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5052179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5052665Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5053133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5053783Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5054463Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5055140Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5055808Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5056383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5056849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5057315Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5057759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5058107Z dist init r=3, world=4 2022-11-23T03:01:44.5058357Z dist init r=1, world=4 2022-11-23T03:01:44.5058587Z dist init r=2, world=4 2022-11-23T03:01:44.5058828Z dist init r=0, world=4 2022-11-23T03:01:44.5059063Z ok (4.919s) 2022-11-23T03:01:44.5059460Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5060255Z Tests FSDP's default communication hook's behavior and correctness. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1743 2022-11-23T03:01:44.5060798Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1744 2022-11-23T03:01:44.5061240Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1745 2022-11-23T03:01:44.5061660Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1746 2022-11-23T03:01:44.5062262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5062713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5063267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5063736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5064673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5065121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5065671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5066137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5066708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5067133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5067700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5068162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5068737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5069164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5069730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5070191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5070640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5071155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5071641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5072123Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5072755Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5073451Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5074133Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5074809Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5075307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5075776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5076240Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5076707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5077050Z dist init r=2, world=4 2022-11-23T03:01:44.5077423Z dist init r=0, world=4 2022-11-23T03:01:44.5077681Z dist init r=1, world=4 2022-11-23T03:01:44.5077907Z dist init r=3, world=4 2022-11-23T03:01:44.5078139Z ok (5.019s) 2022-11-23T03:01:44.5078587Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5079320Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2044 2022-11-23T03:01:44.5079838Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2045 2022-11-23T03:01:44.5080279Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 2046 2022-11-23T03:01:44.5080713Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 2047 2022-11-23T03:01:44.5081304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5081865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5082424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5082857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5083425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 
420 disabled tests 2022-11-23T03:01:44.5083895Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5084587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5085036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5085611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5086067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5086640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5087084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5087656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5088100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5088647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5089106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5089554Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5090045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5090520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5091007Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5091658Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5092339Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5093002Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5093681Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5094201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5094704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5095179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5095638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5096230Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5096603Z return func(*args, **kwargs) 2022-11-23T03:01:44.5097128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5097515Z _check_comm_hook( 2022-11-23T03:01:44.5098000Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5098471Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5099079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5099462Z traceback.print_stack() 2022-11-23T03:01:44.5099936Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5100316Z return func(*args, **kwargs) 2022-11-23T03:01:44.5100835Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5101199Z _check_comm_hook( 2022-11-23T03:01:44.5101701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5102173Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5102728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5103094Z traceback.print_stack() 2022-11-23T03:01:44.5103584Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5104160Z return func(*args, **kwargs) 2022-11-23T03:01:44.5104674Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5105058Z _check_comm_hook( 2022-11-23T03:01:44.5105555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5106024Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5106560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5106940Z traceback.print_stack() 2022-11-23T03:01:44.5107428Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5107792Z return func(*args, **kwargs) 2022-11-23T03:01:44.5108314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5108697Z _check_comm_hook( 2022-11-23T03:01:44.5109193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5109642Z 
p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5110190Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5110570Z traceback.print_stack() 2022-11-23T03:01:44.5111043Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5111419Z return func(*args, **kwargs) 2022-11-23T03:01:44.5111939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5112399Z _check_comm_hook( 2022-11-23T03:01:44.5112873Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5113251Z return func(*args, **kwargs) 2022-11-23T03:01:44.5113759Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5114114Z p_assert( 2022-11-23T03:01:44.5114609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5114992Z _check_comm_hook( 2022-11-23T03:01:44.5115444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5115821Z traceback.print_stack() 2022-11-23T03:01:44.5116332Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5116774Z p_assert( 2022-11-23T03:01:44.5117228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5117605Z traceback.print_stack() 2022-11-23T03:01:44.5118092Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5118453Z return func(*args, **kwargs) 
2022-11-23T03:01:44.5118968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5119351Z _check_comm_hook( 2022-11-23T03:01:44.5119845Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5120200Z p_assert( 2022-11-23T03:01:44.5120659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5121039Z traceback.print_stack() 2022-11-23T03:01:44.5121515Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5121892Z return func(*args, **kwargs) 2022-11-23T03:01:44.5122411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5122778Z _check_comm_hook( 2022-11-23T03:01:44.5123273Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5123644Z p_assert( 2022-11-23T03:01:44.5124101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5124463Z traceback.print_stack() 2022-11-23T03:01:44.5124725Z dist init r=1, world=4 2022-11-23T03:01:44.5125006Z Communication hook should not be `None` 2022-11-23T03:01:44.5125313Z Communication hook state should not be `None` 2022-11-23T03:01:44.5125602Z dist init r=3, world=4 2022-11-23T03:01:44.5125878Z Communication hook should not be `None` 2022-11-23T03:01:44.5126186Z Communication hook state should not be `None` 2022-11-23T03:01:44.5126474Z dist init r=0, world=4 2022-11-23T03:01:44.5126753Z Communication hook should not be `None` 2022-11-23T03:01:44.5127054Z Communication hook state should not be `None` 2022-11-23T03:01:44.5127339Z dist init r=2, world=4 
2022-11-23T03:01:44.5127617Z Communication hook should not be `None` 2022-11-23T03:01:44.5127919Z Communication hook state should not be `None` 2022-11-23T03:01:44.5128195Z ok (4.920s) 2022-11-23T03:01:44.5128700Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5129449Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2345 2022-11-23T03:01:44.5129950Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2346 2022-11-23T03:01:44.5130449Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 2347 2022-11-23T03:01:44.5130896Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 2348 2022-11-23T03:01:44.5131484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5131937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5132507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5132977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5133537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5133984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5134551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5135067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5135629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5136721Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5137303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5137756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5138329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5138774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5139341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5139800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5140251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5140743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5141232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5141704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5142351Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5143036Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5143705Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5144659Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5145183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5145651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5146105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5146571Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5147162Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5147544Z return func(*args, **kwargs) 2022-11-23T03:01:44.5148050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5148557Z _check_comm_hook( 2022-11-23T03:01:44.5149073Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5149525Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5150074Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5150453Z traceback.print_stack() 2022-11-23T03:01:44.5150942Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5151304Z return func(*args, **kwargs) 2022-11-23T03:01:44.5151830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5152215Z _check_comm_hook( 2022-11-23T03:01:44.5152695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5153244Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5153801Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5154179Z traceback.print_stack() 2022-11-23T03:01:44.5154655Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5155032Z return func(*args, **kwargs) 2022-11-23T03:01:44.5155550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5155921Z _check_comm_hook( 2022-11-23T03:01:44.5156417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5156886Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5157445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5157807Z traceback.print_stack() 2022-11-23T03:01:44.5158299Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5158676Z return func(*args, **kwargs) 2022-11-23T03:01:44.5159176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5159561Z _check_comm_hook( 2022-11-23T03:01:44.5160060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5160511Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5161065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5161452Z traceback.print_stack() 2022-11-23T03:01:44.5161950Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5162311Z return func(*args, **kwargs) 
2022-11-23T03:01:44.5162830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5163208Z _check_comm_hook( 2022-11-23T03:01:44.5163688Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5164059Z p_assert( 2022-11-23T03:01:44.5164519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5164893Z traceback.print_stack() 2022-11-23T03:01:44.5165368Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5165742Z return func(*args, **kwargs) 2022-11-23T03:01:44.5166316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5166695Z _check_comm_hook( 2022-11-23T03:01:44.5167198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5167570Z p_assert( 2022-11-23T03:01:44.5168013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5168394Z traceback.print_stack() 2022-11-23T03:01:44.5168883Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5169260Z return func(*args, **kwargs) 2022-11-23T03:01:44.5169763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5170145Z _check_comm_hook( 2022-11-23T03:01:44.5170703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5171059Z p_assert( 2022-11-23T03:01:44.5171517Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5171895Z traceback.print_stack() 2022-11-23T03:01:44.5172382Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5172742Z return func(*args, **kwargs) 2022-11-23T03:01:44.5173258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5173640Z _check_comm_hook( 2022-11-23T03:01:44.5174120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5174493Z p_assert( 2022-11-23T03:01:44.5174963Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5175335Z traceback.print_stack() 2022-11-23T03:01:44.5175610Z dist init r=3, world=4 2022-11-23T03:01:44.5175901Z Communication hook should not be `None` 2022-11-23T03:01:44.5176235Z Communication hook state should not be `None` 2022-11-23T03:01:44.5176510Z dist init r=1, world=4 2022-11-23T03:01:44.5176792Z Communication hook should not be `None` 2022-11-23T03:01:44.5177112Z Communication hook state should not be `None` 2022-11-23T03:01:44.5177437Z dist init r=2, world=4 2022-11-23T03:01:44.5177714Z Communication hook should not be `None` 2022-11-23T03:01:44.5178034Z Communication hook state should not be `None` 2022-11-23T03:01:44.5178303Z dist init r=0, world=4 2022-11-23T03:01:44.5178581Z Communication hook should not be `None` 2022-11-23T03:01:44.5178900Z Communication hook state should not be `None` 2022-11-23T03:01:44.5179159Z ok (4.920s) 2022-11-23T03:01:44.5179611Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5180377Z Tests FSDP's communication hook interface behavior. 
... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2646 2022-11-23T03:01:44.5180899Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2647 2022-11-23T03:01:44.5181329Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 2648 2022-11-23T03:01:44.5181769Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 2649 2022-11-23T03:01:44.5182370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5182806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5183370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5183842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5184785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5185225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5185794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5186256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5186812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5187251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5187814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5188273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5188932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:01:44.5189381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5189947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5190408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5190840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5191334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5191820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5192290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5192942Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5193633Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5194316Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5194980Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5195501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5195973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5196438Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5196887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5197479Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5197869Z return func(*args, **kwargs) 2022-11-23T03:01:44.5198377Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5198768Z _check_comm_hook( 2022-11-23T03:01:44.5199270Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5199738Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5200274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5200653Z traceback.print_stack() 2022-11-23T03:01:44.5201143Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5201508Z return func(*args, **kwargs) 2022-11-23T03:01:44.5202076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5202468Z _check_comm_hook( 2022-11-23T03:01:44.5202969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5203422Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5203974Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5204357Z traceback.print_stack() 2022-11-23T03:01:44.5204828Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5205207Z return func(*args, **kwargs) 2022-11-23T03:01:44.5205730Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5206173Z _check_comm_hook( 2022-11-23T03:01:44.5206659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5207130Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5207684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5208044Z traceback.print_stack() 2022-11-23T03:01:44.5208533Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5208911Z return func(*args, **kwargs) 2022-11-23T03:01:44.5209413Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5209800Z _check_comm_hook( 2022-11-23T03:01:44.5210300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5210774Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5211308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5211689Z traceback.print_stack() 2022-11-23T03:01:44.5212175Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5212536Z return func(*args, **kwargs) 
2022-11-23T03:01:44.5213054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5213437Z _check_comm_hook( 2022-11-23T03:01:44.5213933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5214289Z p_assert( 2022-11-23T03:01:44.5214749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5215195Z traceback.print_stack() 2022-11-23T03:01:44.5215671Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5216046Z return func(*args, **kwargs) 2022-11-23T03:01:44.5216561Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5216941Z _check_comm_hook( 2022-11-23T03:01:44.5217422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5217790Z p_assert( 2022-11-23T03:01:44.5218249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5218610Z traceback.print_stack() 2022-11-23T03:01:44.5219095Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5219530Z return func(*args, **kwargs) 2022-11-23T03:01:44.5220048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5220433Z _check_comm_hook( 2022-11-23T03:01:44.5220930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5221302Z p_assert( 2022-11-23T03:01:44.5221753Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5222137Z traceback.print_stack() 2022-11-23T03:01:44.5222633Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5223002Z return func(*args, **kwargs) 2022-11-23T03:01:44.5223534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5224357Z _check_comm_hook( 2022-11-23T03:01:44.5224886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5225247Z p_assert( 2022-11-23T03:01:44.5225718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5226107Z traceback.print_stack() 2022-11-23T03:01:44.5226359Z dist init r=0, world=4 2022-11-23T03:01:44.5226650Z Communication hook should not be `None` 2022-11-23T03:01:44.5226986Z Communication hook state should not be `None` 2022-11-23T03:01:44.5227260Z dist init r=2, world=4 2022-11-23T03:01:44.5227537Z Communication hook should not be `None` 2022-11-23T03:01:44.5227858Z Communication hook state should not be `None` 2022-11-23T03:01:44.5228128Z dist init r=1, world=4 2022-11-23T03:01:44.5228402Z Communication hook should not be `None` 2022-11-23T03:01:44.5228778Z Communication hook state should not be `None` 2022-11-23T03:01:44.5229059Z dist init r=3, world=4 2022-11-23T03:01:44.5229347Z Communication hook should not be `None` 2022-11-23T03:01:44.5229681Z Communication hook state should not be `None` 2022-11-23T03:01:44.5229944Z ok (4.920s) 2022-11-23T03:01:44.5230401Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5231158Z Tests FSDP's communication hook interface behavior. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2947 2022-11-23T03:01:44.5231685Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2948 2022-11-23T03:01:44.5232113Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 2949 2022-11-23T03:01:44.5232552Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 2950 2022-11-23T03:01:44.5233165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5233632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5234195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5234675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5235257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5235691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5236263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5236737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5237311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5237742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5238400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5238872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5239428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5239871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5240443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5240915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5241350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5241853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5242414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5242909Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5243555Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5244252Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5244941Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5245624Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5246130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5246619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5247099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5247555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5248157Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5248549Z return func(*args, **kwargs) 2022-11-23T03:01:44.5249082Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5249461Z _check_comm_hook( 2022-11-23T03:01:44.5249966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5250447Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5250992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5251381Z traceback.print_stack() 2022-11-23T03:01:44.5251876Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5252261Z return func(*args, **kwargs) 2022-11-23T03:01:44.5252769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5253161Z _check_comm_hook( 2022-11-23T03:01:44.5253668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5254126Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5254689Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5255084Z traceback.print_stack() 2022-11-23T03:01:44.5255641Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5256016Z return func(*args, **kwargs) 2022-11-23T03:01:44.5256547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5256943Z _check_comm_hook( 2022-11-23T03:01:44.5257429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5257900Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5258450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5258828Z traceback.print_stack() 2022-11-23T03:01:44.5259301Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5259735Z return func(*args, **kwargs) 2022-11-23T03:01:44.5260260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5260628Z _check_comm_hook( 2022-11-23T03:01:44.5261126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5261593Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5262142Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5262505Z traceback.print_stack() 2022-11-23T03:01:44.5262991Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5263368Z return func(*args, **kwargs) 
2022-11-23T03:01:44.5264076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5264491Z _check_comm_hook( 2022-11-23T03:01:44.5264994Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5265349Z p_assert( 2022-11-23T03:01:44.5265806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5266182Z traceback.print_stack() 2022-11-23T03:01:44.5266672Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5267033Z return func(*args, **kwargs) 2022-11-23T03:01:44.5267554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5267938Z _check_comm_hook( 2022-11-23T03:01:44.5268421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5268793Z p_assert( 2022-11-23T03:01:44.5269257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5269637Z traceback.print_stack() 2022-11-23T03:01:44.5270111Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5270486Z return func(*args, **kwargs) 2022-11-23T03:01:44.5271007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5271378Z _check_comm_hook( 2022-11-23T03:01:44.5271877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5272248Z p_assert( 2022-11-23T03:01:44.5272691Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5273068Z traceback.print_stack() 2022-11-23T03:01:44.5273650Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5274034Z return func(*args, **kwargs) 2022-11-23T03:01:44.5274540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5274927Z _check_comm_hook( 2022-11-23T03:01:44.5275423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5275777Z p_assert( 2022-11-23T03:01:44.5276240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5276615Z traceback.print_stack() 2022-11-23T03:01:44.5276860Z dist init r=2, world=4 2022-11-23T03:01:44.5277141Z Communication hook should not be `None` 2022-11-23T03:01:44.5277512Z Communication hook state should not be `None` 2022-11-23T03:01:44.5277877Z dist init r=0, world=4 2022-11-23T03:01:44.5278137Z Communication hook should not be `None` 2022-11-23T03:01:44.5278463Z Communication hook state should not be `None` 2022-11-23T03:01:44.5278752Z dist init r=3, world=4 2022-11-23T03:01:44.5279011Z Communication hook should not be `None` 2022-11-23T03:01:44.5279328Z Communication hook state should not be `None` 2022-11-23T03:01:44.5279610Z dist init r=1, world=4 2022-11-23T03:01:44.5279867Z Communication hook should not be `None` 2022-11-23T03:01:44.5280190Z Communication hook state should not be `None` 2022-11-23T03:01:44.5280464Z ok (4.919s) 2022-11-23T03:01:44.5280890Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5281638Z Tests FSDP's communication hook interface behavior. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3248 2022-11-23T03:01:44.5282160Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3249 2022-11-23T03:01:44.5282611Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 3250 2022-11-23T03:01:44.5283033Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 3251 2022-11-23T03:01:44.5283631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5284080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5284632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5285101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5285677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5286123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5286674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5287150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5287725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5288150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5288715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5289179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5289750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5290174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5290739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5291254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5312644Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5313190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5313706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5314213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5314924Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5315630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5316320Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5317212Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5317726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5318215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5318696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5319159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5319758Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5320153Z return func(*args, **kwargs) 2022-11-23T03:01:44.5320695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5321210Z _check_comm_hook( 2022-11-23T03:01:44.5321740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5322230Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5322779Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5323165Z traceback.print_stack() 2022-11-23T03:01:44.5323669Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5324053Z return func(*args, **kwargs) 2022-11-23T03:01:44.5324570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5324959Z _check_comm_hook( 2022-11-23T03:01:44.5325472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5325944Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5326509Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5326897Z traceback.print_stack() 2022-11-23T03:01:44.5327394Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5327761Z return func(*args, **kwargs) 2022-11-23T03:01:44.5328296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5328741Z _check_comm_hook( 2022-11-23T03:01:44.5329239Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5329723Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5330379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5330777Z traceback.print_stack() 2022-11-23T03:01:44.5331264Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5331650Z return func(*args, **kwargs) 2022-11-23T03:01:44.5332185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5332559Z _check_comm_hook( 2022-11-23T03:01:44.5333068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5333549Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5334092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5334549Z traceback.print_stack() 2022-11-23T03:01:44.5335054Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5335437Z return func(*args, **kwargs) 
2022-11-23T03:01:44.5335917Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5336290Z return func(*args, **kwargs) 2022-11-23T03:01:44.5336813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5337183Z _check_comm_hook( 2022-11-23T03:01:44.5337697Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5338088Z _check_comm_hook( 2022-11-23T03:01:44.5338592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5338958Z p_assert( 2022-11-23T03:01:44.5339460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5339833Z p_assert( 2022-11-23T03:01:44.5340285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5340668Z traceback.print_stack() 2022-11-23T03:01:44.5341160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5341524Z traceback.print_stack() 2022-11-23T03:01:44.5342021Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5342401Z return func(*args, **kwargs) 2022-11-23T03:01:44.5342928Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5343299Z _check_comm_hook( 2022-11-23T03:01:44.5343813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5344436Z p_assert( 2022-11-23T03:01:44.5344905Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5345292Z traceback.print_stack() 2022-11-23T03:01:44.5345791Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5346177Z return func(*args, **kwargs) 2022-11-23T03:01:44.5346689Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5347076Z _check_comm_hook( 2022-11-23T03:01:44.5347585Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5347941Z p_assert( 2022-11-23T03:01:44.5348407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5348876Z traceback.print_stack() 2022-11-23T03:01:44.5349139Z dist init r=1, world=4 2022-11-23T03:01:44.5349428Z Communication hook should not be `None` 2022-11-23T03:01:44.5349757Z Communication hook state should not be `None` 2022-11-23T03:01:44.5350032Z dist init r=3, world=4 2022-11-23T03:01:44.5350310Z Communication hook should not be `None` 2022-11-23T03:01:44.5350637Z Communication hook state should not be `None` 2022-11-23T03:01:44.5350926Z dist init r=2, world=4 2022-11-23T03:01:44.5351187Z Communication hook should not be `None` 2022-11-23T03:01:44.5351509Z Communication hook state should not be `None` 2022-11-23T03:01:44.5351798Z dist init r=0, world=4 2022-11-23T03:01:44.5352059Z Communication hook should not be `None` 2022-11-23T03:01:44.5352382Z Communication hook state should not be `None` 2022-11-23T03:01:44.5352660Z ok (4.920s) 2022-11-23T03:01:44.5353103Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5353959Z Tests FSDP's communication hook interface behavior. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3549 2022-11-23T03:01:44.5354492Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3550 2022-11-23T03:01:44.5354954Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 3551 2022-11-23T03:01:44.5355394Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 3552 2022-11-23T03:01:44.5356005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5356463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5357032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5357514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5358103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5358558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5359120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5359589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5360168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5360597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5361173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5361639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5362221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5362655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5363240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5363707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5364165Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5364659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5365162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5365669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5366425Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5367142Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5367839Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5368536Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5369048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5369527Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5370002Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5370482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5371131Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5371521Z return func(*args, **kwargs) 2022-11-23T03:01:44.5372116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5372494Z _check_comm_hook( 2022-11-23T03:01:44.5373005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5373484Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5374047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5374416Z traceback.print_stack() 2022-11-23T03:01:44.5374914Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5375300Z return func(*args, **kwargs) 2022-11-23T03:01:44.5375819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5376214Z _check_comm_hook( 2022-11-23T03:01:44.5376724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5377207Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5377816Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5378207Z traceback.print_stack() 2022-11-23T03:01:44.5378708Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5379075Z return func(*args, **kwargs) 2022-11-23T03:01:44.5379610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5380003Z _check_comm_hook( 2022-11-23T03:01:44.5380522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5380983Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5381539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5381924Z traceback.print_stack() 2022-11-23T03:01:44.5382406Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5382791Z return func(*args, **kwargs) 2022-11-23T03:01:44.5383320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5383689Z _check_comm_hook( 2022-11-23T03:01:44.5384484Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T03:01:44.5385054Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T03:01:44.5385633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5385999Z traceback.print_stack() 2022-11-23T03:01:44.5386497Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5386879Z return func(*args, **kwargs) 
2022-11-23T03:01:44.5387397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5387786Z _check_comm_hook( 2022-11-23T03:01:44.5388294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5388670Z p_assert( 2022-11-23T03:01:44.5389122Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5389579Z return func(*args, **kwargs) 2022-11-23T03:01:44.5390076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5390441Z traceback.print_stack() 2022-11-23T03:01:44.5390972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5391361Z _check_comm_hook( 2022-11-23T03:01:44.5391867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5392227Z p_assert( 2022-11-23T03:01:44.5392694Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5393076Z traceback.print_stack() 2022-11-23T03:01:44.5393558Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5393949Z return func(*args, **kwargs) 2022-11-23T03:01:44.5394481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5394856Z _check_comm_hook( 2022-11-23T03:01:44.5395364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5395744Z p_assert( 2022-11-23T03:01:44.5396211Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5396577Z traceback.print_stack() 2022-11-23T03:01:44.5397074Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:01:44.5397459Z return func(*args, **kwargs) 2022-11-23T03:01:44.5397968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T03:01:44.5398359Z _check_comm_hook( 2022-11-23T03:01:44.5398868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T03:01:44.5399246Z p_assert( 2022-11-23T03:01:44.5399694Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:01:44.5400078Z traceback.print_stack() 2022-11-23T03:01:44.5400344Z dist init r=3, world=4 2022-11-23T03:01:44.5400613Z Communication hook should not be `None` 2022-11-23T03:01:44.5400940Z Communication hook state should not be `None` 2022-11-23T03:01:44.5401230Z dist init r=2, world=4 2022-11-23T03:01:44.5401493Z Communication hook should not be `None` 2022-11-23T03:01:44.5401818Z Communication hook state should not be `None` 2022-11-23T03:01:44.5402106Z dist init r=0, world=4 2022-11-23T03:01:44.5402368Z Communication hook should not be `None` 2022-11-23T03:01:44.5402693Z Communication hook state should not be `None` 2022-11-23T03:01:44.5402985Z dist init r=1, world=4 2022-11-23T03:01:44.5403344Z Communication hook should not be `None` 2022-11-23T03:01:44.5403682Z Communication hook state should not be `None` 2022-11-23T03:01:44.5403959Z ok (4.920s) 2022-11-23T03:01:44.5404468Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3850 2022-11-23T03:01:44.5405091Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3851 2022-11-23T03:01:44.5405547Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 3852 2022-11-23T03:01:44.5406000Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 3853 2022-11-23T03:01:44.5406611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5407069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5407714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5408198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5408770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5409221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5409794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5410255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5410840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5411292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5411872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5412328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5412908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5413357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5413934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5414387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5414849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5415358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5415847Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5416099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5416506Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5416907Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5417298Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5417690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5417926Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5418156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5418441Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5418660Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5418775Z dist init r=2, world=4 2022-11-23T03:01:44.5418886Z dist init r=0, world=4 2022-11-23T03:01:44.5418994Z dist init r=1, world=4 2022-11-23T03:01:44.5419102Z dist init r=3, world=4 2022-11-23T03:01:44.5419202Z ok (4.919s) 2022-11-23T03:01:44.5419597Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4151 2022-11-23T03:01:44.5419798Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4152 2022-11-23T03:01:44.5420016Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4153 2022-11-23T03:01:44.5420232Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4154 2022-11-23T03:01:44.5420673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5420853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5421240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5421438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5421814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5421990Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:01:44.5422351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5422541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5422913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5423090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5423470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5423662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5424340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5424528Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5424901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5425094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5425349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5425602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5425849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5426093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5426495Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5426896Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5427291Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5427713Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5428027Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5428274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5428506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5428794Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5428906Z dist init r=0, world=4 2022-11-23T03:01:44.5429015Z dist init r=1, world=4 2022-11-23T03:01:44.5429126Z dist init r=3, world=4 2022-11-23T03:01:44.5429216Z dist init r=2, world=4 2022-11-23T03:01:44.5429317Z ok (4.919s) 2022-11-23T03:01:44.5429717Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4452 2022-11-23T03:01:44.5430003Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4453 2022-11-23T03:01:44.5430222Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4454 2022-11-23T03:01:44.5430439Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4455 2022-11-23T03:01:44.5430819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5430997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5431363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5431558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5431924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5432099Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5432481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5432672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5433035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5433211Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5433581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5433751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5434112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5434286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5434674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5434861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5435110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5435356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5435599Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5435843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5436233Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5436637Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5437079Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5437484Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5437719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5437952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5438181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5438412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5438523Z dist init r=2, world=4 2022-11-23T03:01:44.5438614Z dist init r=1, world=4 2022-11-23T03:01:44.5438721Z dist init r=0, world=4 2022-11-23T03:01:44.5438828Z dist init r=3, world=4 2022-11-23T03:01:44.5438977Z ok (4.918s) 2022-11-23T03:01:44.5439378Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4753 2022-11-23T03:01:44.5439597Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4754 2022-11-23T03:01:44.5439815Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4755 2022-11-23T03:01:44.5440014Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4756 2022-11-23T03:01:44.5440392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5440569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5440952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5441148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5441520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5441699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:01:44.5442074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5442266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5442614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5442788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5443162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5443352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5443724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5443897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5444277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5444466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5444698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5444946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5445196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5445439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5445891Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5446304Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5446701Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5447095Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5447327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5447541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5447772Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5448001Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5448179Z dist init r=3, world=4 2022-11-23T03:01:44.5448288Z dist init r=1, world=4 2022-11-23T03:01:44.5448397Z dist init r=2, world=4 2022-11-23T03:01:44.5448502Z dist init r=0, world=4 2022-11-23T03:01:44.5448601Z ok (4.918s) 2022-11-23T03:01:44.5448977Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5054 2022-11-23T03:01:44.5449197Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5055 2022-11-23T03:01:44.5449414Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5056 2022-11-23T03:01:44.5449631Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5057 2022-11-23T03:01:44.5450010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5450192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5450582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5450776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5451125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5451299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5451674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5451863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5452228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5452403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5452782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5452970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5453330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5453486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5453856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5454048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5454297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5454543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5454838Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5455092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5455499Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5455894Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5456267Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5456656Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5456889Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5457165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5457398Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5457628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5457740Z dist init r=1, world=4 2022-11-23T03:01:44.5457848Z dist init r=3, world=4 2022-11-23T03:01:44.5457938Z dist init r=2, world=4 2022-11-23T03:01:44.5458044Z dist init r=0, world=4 2022-11-23T03:01:44.5458140Z ok (4.918s) 2022-11-23T03:01:44.5458537Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5355 2022-11-23T03:01:44.5458756Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5356 2022-11-23T03:01:44.5458972Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5357 2022-11-23T03:01:44.5459191Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5358 2022-11-23T03:01:44.5459575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5459737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5460121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5460312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5460681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5460854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-11-23T03:01:44.5461234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5461428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5461799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5461956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5462336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5462528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5462891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5463063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5463440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5463628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5464290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5464563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5464792Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5465037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5465449Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5465852Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5466248Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5466706Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5466943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5467175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5467408Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5467619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5467732Z dist init r=1, world=4 2022-11-23T03:01:44.5467841Z dist init r=3, world=4 2022-11-23T03:01:44.5467952Z dist init r=2, world=4 2022-11-23T03:01:44.5468059Z dist init r=0, world=4 2022-11-23T03:01:44.5468156Z ok (4.918s) 2022-11-23T03:01:44.5468436Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5468863Z Tests FSDP's communication hook registering for submodules. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5656 2022-11-23T03:01:44.5469084Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5657 2022-11-23T03:01:44.5469300Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5658 2022-11-23T03:01:44.5469510Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5659 2022-11-23T03:01:44.5469889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5470065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5470447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5470640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5471006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5471172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5471547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5471736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5472105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5472278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5472650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5472839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5473204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5473427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5473810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5474001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5474250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5474497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5474742Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5474980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5475382Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5475839Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5476215Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5476609Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5476845Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5477078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5477309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5477595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5477709Z dist init r=3, world=4 2022-11-23T03:01:44.5477824Z dist init r=0, world=4 2022-11-23T03:01:44.5477932Z dist init r=2, world=4 2022-11-23T03:01:44.5478026Z dist init r=1, world=4 2022-11-23T03:01:44.5478127Z ok (4.417s) 2022-11-23T03:01:44.5478404Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5478845Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5941 2022-11-23T03:01:44.5479067Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5942 2022-11-23T03:01:44.5479291Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5943 2022-11-23T03:01:44.5479506Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5944 2022-11-23T03:01:44.5479866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5480044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5480434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5480631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5480997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T03:01:44.5481172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5481548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5481741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5482105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5482261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5482684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5482883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5483256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5483431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5483802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5483993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5484246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5484496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5484723Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5485019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5485423Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5485827Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5486226Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5486626Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5486858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5487090Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5487327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5487538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5487648Z dist init r=3, world=4 2022-11-23T03:01:44.5487757Z dist init r=0, world=4 2022-11-23T03:01:44.5487863Z dist init r=1, world=4 2022-11-23T03:01:44.5487969Z dist init r=2, world=4 2022-11-23T03:01:44.5488067Z ok (4.317s) 2022-11-23T03:01:44.5488346Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5488764Z Tests FSDP's communication hook registering for submodules. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6226 2022-11-23T03:01:44.5488984Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6227 2022-11-23T03:01:44.5489201Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6228 2022-11-23T03:01:44.5489424Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6229 2022-11-23T03:01:44.5489807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5489983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5490363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5490558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5490926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5491082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5491458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5491652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5492068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5492253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5492638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5492834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5493200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5493358Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5493731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5493924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5494223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5494473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5494716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5494959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5495366Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5495764Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5496146Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5496550Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5496787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5497017Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5497245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5497468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5497578Z dist init r=3, world=4 2022-11-23T03:01:44.5497686Z dist init r=0, world=4 2022-11-23T03:01:44.5497776Z dist init r=1, world=4 2022-11-23T03:01:44.5497886Z dist init r=2, world=4 2022-11-23T03:01:44.5497984Z ok (4.317s) 2022-11-23T03:01:44.5498270Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5498712Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6511 2022-11-23T03:01:44.5498931Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6512 2022-11-23T03:01:44.5499147Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6513 2022-11-23T03:01:44.5499362Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6514 2022-11-23T03:01:44.5499725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5499903Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5500284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5500477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5500898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T03:01:44.5501079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5501459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5501653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5502020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5502176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5502542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5502719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5503099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5503349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5503728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5504165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5504424Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5504656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5504903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5505144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5505558Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5505969Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5506371Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5506766Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5506999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5507221Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5507424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5507655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5507769Z dist init r=3, world=4 2022-11-23T03:01:44.5507876Z dist init r=0, world=4 2022-11-23T03:01:44.5507987Z dist init r=2, world=4 2022-11-23T03:01:44.5508094Z dist init r=1, world=4 2022-11-23T03:01:44.5508192Z ok (4.317s) 2022-11-23T03:01:44.5508457Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5508893Z Tests FSDP's communication hook registering for submodules. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6796 2022-11-23T03:01:44.5509114Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6797 2022-11-23T03:01:44.5509333Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6798 2022-11-23T03:01:44.5509550Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6799 2022-11-23T03:01:44.5509926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5510104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5510561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5510767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5511120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5511297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5511673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5511864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5512227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5512401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5512853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5513042Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5513413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:01:44.5513569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5513945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5514136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5514384Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5514633Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5514884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5515126Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5515533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5515918Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5516315Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5516703Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5516935Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5517164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5517396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5517624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5517737Z dist init r=3, world=4 2022-11-23T03:01:44.5517845Z dist init r=0, world=4 2022-11-23T03:01:44.5517936Z dist init r=2, world=4 2022-11-23T03:01:44.5518043Z dist init r=1, world=4 2022-11-23T03:01:44.5518140Z ok (4.418s) 2022-11-23T03:01:44.5518431Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T03:01:44.5518866Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7081 2022-11-23T03:01:44.5519086Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7082 2022-11-23T03:01:44.5519307Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7083 2022-11-23T03:01:44.5519571Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7084 2022-11-23T03:01:44.5519938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5520120Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5520505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5520698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5521064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:01:44.5521242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5521620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5521866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5522216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5522392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5522768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5522963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5523337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:01:44.5523511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:01:44.5523882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:01:44.5524076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:01:44.5524327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:01:44.5524558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:01:44.5524802Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:01:44.5525045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:01:44.5525448Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:01:44.5525844Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5526240Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5526636Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:01:44.5526871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:01:44.5527095Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:01:44.5527308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:01:44.5527536Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:01:44.5527650Z dist init r=3, world=4 2022-11-23T03:01:44.5527759Z dist init r=0, world=4 2022-11-23T03:01:44.5527867Z dist init r=1, world=4 2022-11-23T03:01:44.5527974Z dist init r=2, world=4 2022-11-23T03:01:44.5528071Z ok (4.318s) 2022-11-23T03:01:44.5528098Z 2022-11-23T03:01:44.5528348Z ---------------------------------------------------------------------- 2022-11-23T03:01:44.5528469Z Ran 27 tests in 131.033s 2022-11-23T03:01:44.5528534Z 2022-11-23T03:01:44.5528686Z OK 2022-11-23T03:01:44.5528707Z 2022-11-23T03:01:44.5528836Z Generating XML reports... 
2022-11-23T03:01:44.5529319Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks/TEST-TestCommunicationHooks-20221123025933.xml 2022-11-23T03:01:44.5529340Z 2022-11-23T03:01:44.5529980Z ##[endgroup] 2022-11-23T03:01:44.5530469Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm_hooks (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm_hooks_gtmjrjfq) 2022-11-23T03:01:44.5530490Z 2022-11-23T03:01:44.8289600Z 2022-11-23T03:01:44.8289867Z real 2m18.992s 2022-11-23T03:01:44.8290165Z user 7m35.679s 2022-11-23T03:01:44.8291071Z sys 5m8.999s 2022-11-23T03:01:44.8291528Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:01:44.8292132Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_core.py 2022-11-23T03:01:47.1741295Z Ignoring disabled issues: [] 2022-11-23T03:01:47.2267588Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:01:47.2268658Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:01:47.2269173Z Selected tests: 2022-11-23T03:01:47.2269418Z distributed/fsdp/test_fsdp_core.py 2022-11-23T03:01:47.2293676Z Prioritized test from test file changes. 
2022-11-23T03:01:47.2294581Z reordering tests for PR: 2022-11-23T03:01:47.2294931Z prioritized: [] 2022-11-23T03:01:47.2295420Z the rest: ['distributed/fsdp/test_fsdp_core.py'] 2022-11-23T03:01:47.2295644Z 2022-11-23T03:01:47.2296202Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:01:47.2297181Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:01:47.2302003Z parallel (file granularity) tests: 2022-11-23T03:01:47.2302572Z 2022-11-23T03:01:47.2303092Z serial (file granularity) tests: 2022-11-23T03:01:47.2303740Z distributed/fsdp/test_fsdp_core.py 2022-11-23T03:01:49.5302758Z Ignoring disabled issues: [] 2022-11-23T03:01:49.9505298Z Running distributed/fsdp/test_fsdp_core.py ... [2022-11-23 03:01:49.949876] 2022-11-23T03:01:49.9506719Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:01:49.950335] 2022-11-23T03:12:17.9553225Z 2022-11-23T03:12:17.9557510Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_core 2022-11-23T03:12:17.9558507Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_core (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_core_8ncfwqpo) 2022-11-23T03:12:17.9627523Z 2022-11-23T03:12:17.9627939Z Running tests... 
2022-11-23T03:12:17.9628517Z ---------------------------------------------------------------------- 2022-11-23T03:12:17.9631624Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_core 2022-11-23T03:12:17.9632133Z test_pre_backward_hook_registration_after_state_dict (__main__.TestHooks) 2022-11-23T03:12:17.9632750Z Tests that FSDP pre-backward hooks are registered on forward pass ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:12:17.9633268Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7578 2022-11-23T03:12:17.9633714Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7579 2022-11-23T03:12:17.9635576Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7580 2022-11-23T03:12:17.9636033Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7581 2022-11-23T03:12:17.9636685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9637429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9638063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9638554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9641667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9642143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9642774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9643263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9643864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:17.9644537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9645133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9645615Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9646198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9646669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9647630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9648483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9649351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:17.9650277Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:17.9651191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:17.9652082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:17.9653310Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9654602Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9655894Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9657163Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:17.9658256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:17.9659130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:17.9660059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:17.9661452Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:17.9664287Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9665878Z warnings.warn( 2022-11-23T03:12:17.9668146Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9669568Z warnings.warn( 2022-11-23T03:12:17.9671703Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9673100Z warnings.warn( 2022-11-23T03:12:17.9675367Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9676749Z warnings.warn( 2022-11-23T03:12:17.9677188Z dist init r=3, world=4 2022-11-23T03:12:17.9677584Z dist init r=0, world=4 2022-11-23T03:12:17.9677996Z dist init r=1, world=4 2022-11-23T03:12:17.9678394Z dist init r=2, world=4 2022-11-23T03:12:17.9678773Z ok (7.073s) 2022-11-23T03:12:17.9679362Z test_pre_backward_hook_registration_cuda_first_False (__main__.TestHooks) 2022-11-23T03:12:17.9680480Z Tests that FSDP pre-backward hooks are registered on forward pass ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7879 2022-11-23T03:12:17.9681444Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7880 2022-11-23T03:12:17.9682249Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7881 2022-11-23T03:12:17.9683072Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7882 2022-11-23T03:12:17.9684186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9684929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9685975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9686796Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9687836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9688589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9689633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9690471Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9691503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9692289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9693302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9694081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9695080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:12:17.9695914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9697021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9697875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9698638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:17.9699505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:17.9700449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:17.9701319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:17.9702526Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9703804Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9705582Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9706833Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:17.9707753Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:17.9708628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:17.9709446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:17.9710264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:17.9712705Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9714148Z warnings.warn( 2022-11-23T03:12:17.9716323Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9717691Z warnings.warn( 2022-11-23T03:12:17.9719767Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9721219Z warnings.warn( 2022-11-23T03:12:17.9723385Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9724791Z warnings.warn( 2022-11-23T03:12:17.9725252Z dist init r=1, world=4 2022-11-23T03:12:17.9725696Z dist init r=2, world=4 2022-11-23T03:12:17.9726092Z dist init r=3, world=4 2022-11-23T03:12:17.9726657Z dist init r=0, world=4 2022-11-23T03:12:17.9727112Z ok (5.220s) 2022-11-23T03:12:17.9727649Z test_pre_backward_hook_registration_cuda_first_True (__main__.TestHooks) 2022-11-23T03:12:17.9728882Z Tests that FSDP pre-backward hooks are registered on forward pass ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8180 2022-11-23T03:12:17.9729836Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8181 2022-11-23T03:12:17.9730659Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8182 2022-11-23T03:12:17.9731478Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8183 2022-11-23T03:12:17.9732575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9733411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9734643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9735518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9736542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9737343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9738353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9739167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9740273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9741074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9742099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9742946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9744342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:12:17.9745135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9746149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9747017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9747822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:17.9748768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:17.9749679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:17.9750594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:17.9751723Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9752944Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9754108Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9755334Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:17.9756262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:17.9757107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:17.9758056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:17.9758932Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:17.9759569Z dist init r=0, world=4 2022-11-23T03:12:17.9759992Z dist init r=2, world=4 2022-11-23T03:12:17.9760476Z dist init r=1, world=4 2022-11-23T03:12:17.9760889Z dist init r=3, world=4 2022-11-23T03:12:17.9761268Z ok (5.320s) 2022-11-23T03:12:17.9761873Z test_register_functions_called_cuda_first_False_mixed_precision_False (__main__.TestHooks) 2022-11-23T03:12:17.9762852Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8481 2022-11-23T03:12:17.9763818Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8482 2022-11-23T03:12:17.9764620Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8483 2022-11-23T03:12:17.9765581Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8484 2022-11-23T03:12:17.9766707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9767502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9768591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9769503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9770617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T03:12:17.9771373Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9772456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9773350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9774463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9775254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9776345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9777231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9778392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9779202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9780267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9781127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9781912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:17.9782796Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:17.9783713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:17.9784991Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:17.9786165Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:17.9787424Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9788734Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9790162Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9790713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:17.9791223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:17.9791871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:17.9792631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:17.9793942Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9794842Z warnings.warn( 2022-11-23T03:12:17.9796053Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:17.9796855Z warnings.warn( 2022-11-23T03:12:17.9798037Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9798838Z warnings.warn( 2022-11-23T03:12:17.9800027Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9800827Z warnings.warn( 2022-11-23T03:12:17.9801069Z dist init r=1, world=4 2022-11-23T03:12:17.9801342Z dist init r=3, world=4 2022-11-23T03:12:17.9801590Z dist init r=0, world=4 2022-11-23T03:12:17.9801835Z dist init r=2, world=4 2022-11-23T03:12:17.9802085Z ok (5.221s) 2022-11-23T03:12:17.9802457Z test_register_functions_called_cuda_first_False_mixed_precision_True (__main__.TestHooks) 2022-11-23T03:12:17.9803013Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8766 2022-11-23T03:12:17.9803571Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8767 2022-11-23T03:12:17.9804031Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8768 2022-11-23T03:12:17.9804471Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8769 2022-11-23T03:12:17.9805101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9805568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9806151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9806612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9807260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9807761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9808382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9808839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9809436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9809902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9810477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9810980Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9811650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:12:17.9812175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9812750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9813237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9813713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:17.9814234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:17.9814715Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:17.9815240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:17.9815927Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9816685Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9817609Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9818317Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:17.9818854Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:17.9819337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:17.9819814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:17.9820295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:17.9821458Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:17.9822190Z warnings.warn( 2022-11-23T03:12:17.9823210Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:17.9824237Z warnings.warn( 2022-11-23T03:12:17.9825371Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 
2022-11-23T03:12:17.9826527Z warnings.warn( 2022-11-23T03:12:17.9828378Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:17.9829625Z warnings.warn( 2022-11-23T03:12:17.9831684Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9833314Z warnings.warn( 2022-11-23T03:12:17.9835423Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9836743Z warnings.warn( 2022-11-23T03:12:17.9838897Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9840281Z warnings.warn( 2022-11-23T03:12:17.9842324Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:17.9843706Z warnings.warn( 2022-11-23T03:12:17.9844153Z dist init r=0, world=4 2022-11-23T03:12:17.9844600Z dist init r=1, world=4 2022-11-23T03:12:17.9844989Z dist init r=3, world=4 2022-11-23T03:12:17.9845435Z dist init r=2, world=4 2022-11-23T03:12:17.9845833Z ok (5.221s) 2022-11-23T03:12:17.9846398Z test_register_functions_called_cuda_first_True_mixed_precision_False (__main__.TestHooks) 2022-11-23T03:12:17.9847360Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9051 2022-11-23T03:12:17.9848279Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9052 2022-11-23T03:12:17.9849096Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9053 2022-11-23T03:12:17.9849855Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9054 2022-11-23T03:12:17.9850963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9851746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9852885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9853722Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9854763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9913467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9914735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9915630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9916721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9917554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9918518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9919602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9920646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:12:17.9921481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9922483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9923300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9924127Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:17.9925012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:17.9925857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:17.9926839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:17.9928061Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9929353Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9930563Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9931821Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:17.9932782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:17.9933643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:17.9934425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:17.9935257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:17.9935875Z dist init r=2, world=4 2022-11-23T03:12:17.9936310Z dist init r=0, world=4 2022-11-23T03:12:17.9936751Z dist init r=1, world=4 2022-11-23T03:12:17.9937175Z dist init r=3, world=4 2022-11-23T03:12:17.9937602Z ok (5.221s) 2022-11-23T03:12:17.9938244Z test_register_functions_called_cuda_first_True_mixed_precision_True (__main__.TestHooks) 2022-11-23T03:12:17.9939276Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9336 2022-11-23T03:12:17.9940289Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9337 2022-11-23T03:12:17.9941095Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9338 2022-11-23T03:12:17.9941962Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9339 2022-11-23T03:12:17.9943232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9944555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9945589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9946435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9947471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T03:12:17.9948255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9949283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9950144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9951386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9952195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9953271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9954087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9955179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9955945Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9956961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9957787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9958548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:17.9959462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:17.9960341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:17.9961227Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:17.9962342Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:17.9963608Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9964880Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9966094Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:17.9966977Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:17.9967814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:17.9968700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:17.9969544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:17.9971525Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:17.9972804Z warnings.warn( 2022-11-23T03:12:17.9974778Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 
2022-11-23T03:12:17.9976070Z warnings.warn( 2022-11-23T03:12:17.9977919Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:17.9979191Z warnings.warn( 2022-11-23T03:12:17.9980971Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:17.9982285Z warnings.warn( 2022-11-23T03:12:17.9982713Z dist init r=2, world=4 2022-11-23T03:12:17.9983147Z dist init r=0, world=4 2022-11-23T03:12:17.9983558Z dist init r=3, world=4 2022-11-23T03:12:17.9984442Z dist init r=1, world=4 2022-11-23T03:12:17.9984831Z ok (5.220s) 2022-11-23T03:12:17.9985367Z test_transformer_no_grad_mixed_precision_False (__main__.TestNoGrad) 2022-11-23T03:12:17.9986517Z Tests that for an FSDP-wrapped transformer model with shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9621 2022-11-23T03:12:17.9987459Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9622 2022-11-23T03:12:17.9988225Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9623 2022-11-23T03:12:17.9988976Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9624 2022-11-23T03:12:17.9990011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9990818Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9991806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9992643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9993666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9994439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9995403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9996231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:17.9997313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:17.9998080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:17.9999110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:17.9999967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0001022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:12:18.0001785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0002801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0003634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0004571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.0005435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.0006300Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.0007233Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.0008375Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0009617Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0010913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0012266Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.0013175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.0014011Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.0014822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.0015695Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.0018036Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0019447Z warnings.warn( 2022-11-23T03:12:18.0021529Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0022931Z warnings.warn( 2022-11-23T03:12:18.0025468Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0026806Z warnings.warn( 2022-11-23T03:12:18.0028901Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0030221Z warnings.warn( 2022-11-23T03:12:18.0030630Z dist init r=1, world=4 2022-11-23T03:12:18.0031046Z dist init r=0, world=4 2022-11-23T03:12:18.0031501Z dist init r=3, world=4 2022-11-23T03:12:18.0031923Z dist init r=2, world=4 2022-11-23T03:12:18.0032344Z ok (5.221s) 2022-11-23T03:12:18.0032912Z test_transformer_no_grad_mixed_precision_True (__main__.TestNoGrad) 2022-11-23T03:12:18.0034193Z Tests that for an FSDP-wrapped transformer model with shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9922 2022-11-23T03:12:18.0035291Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9923 2022-11-23T03:12:18.0036130Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9924 2022-11-23T03:12:18.0036955Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9925 2022-11-23T03:12:18.0038062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0038867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0039858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0040764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0041905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0042675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0043728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0044513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0045574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0046369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0047449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0048291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0049350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:12:18.0050134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0051191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0052016Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0052875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.0053788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.0054633Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.0055499Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.0056629Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0057897Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0059122Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0060350Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.0061282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.0062139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.0062962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.0063804Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.0066396Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:18.0067656Z warnings.warn( 2022-11-23T03:12:18.0069546Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:18.0070783Z warnings.warn( 2022-11-23T03:12:18.0072678Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 
2022-11-23T03:12:18.0074038Z warnings.warn( 2022-11-23T03:12:18.0076075Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0077446Z warnings.warn( 2022-11-23T03:12:18.0079283Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:18.0080532Z warnings.warn( 2022-11-23T03:12:18.0082640Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0084024Z warnings.warn( 2022-11-23T03:12:18.0086132Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0087497Z warnings.warn( 2022-11-23T03:12:18.0089619Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0091060Z warnings.warn( 2022-11-23T03:12:18.0091472Z dist init r=1, world=4 2022-11-23T03:12:18.0091904Z dist init r=2, world=4 2022-11-23T03:12:18.0092353Z dist init r=0, world=4 2022-11-23T03:12:18.0092770Z dist init r=3, world=4 2022-11-23T03:12:18.0093247Z ok (5.221s) 2022-11-23T03:12:18.0093851Z test_param_change_after_init_mixed_precision_False (__main__.TestParamInit) 2022-11-23T03:12:18.0095066Z Tests that changing FSDP model parameter values in-place after FSDP ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10223 2022-11-23T03:12:18.0096053Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10224 2022-11-23T03:12:18.0096846Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10225 2022-11-23T03:12:18.0097643Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10226 2022-11-23T03:12:18.0098784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0099580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0100601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0101711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0102518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0103420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0104820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0105650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0106714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0107547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0108563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0109367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0110422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.0111275Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0112305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0113150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0113964Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.0114913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.0115808Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.0116707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.0117935Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0119154Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0120384Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0121587Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.0122494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.0123321Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.0124112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.0125085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.0127417Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0128791Z warnings.warn( 2022-11-23T03:12:18.0130943Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0132437Z warnings.warn( 2022-11-23T03:12:18.0134537Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0135881Z warnings.warn( 2022-11-23T03:12:18.0137915Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0139289Z warnings.warn( 2022-11-23T03:12:18.0139774Z dist init r=0, world=4 2022-11-23T03:12:18.0140178Z dist init r=3, world=4 2022-11-23T03:12:18.0140629Z dist init r=2, world=4 2022-11-23T03:12:18.0141081Z dist init r=1, world=4 2022-11-23T03:12:18.0141462Z ok (5.120s) 2022-11-23T03:12:18.0142095Z test_param_change_after_init_mixed_precision_True (__main__.TestParamInit) 2022-11-23T03:12:18.0143309Z Tests that changing FSDP model parameter values in-place after FSDP ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10508 2022-11-23T03:12:18.0144666Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10509 2022-11-23T03:12:18.0145444Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10510 2022-11-23T03:12:18.0146215Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10511 2022-11-23T03:12:18.0147306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0148087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0149084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0149948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0151014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0151764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0152769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0153623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0154794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0155552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0156563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0157399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0158432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.0159200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0160213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0161043Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0161917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.0162781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.0163691Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.0164583Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.0165746Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0166975Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0168126Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0169326Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.0170218Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.0171067Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.0171909Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.0172736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.0174787Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:18.0176029Z warnings.warn( 2022-11-23T03:12:18.0177904Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:18.0179145Z warnings.warn( 2022-11-23T03:12:18.0180958Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 
2022-11-23T03:12:18.0182194Z warnings.warn( 2022-11-23T03:12:18.0184683Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T03:12:18.0185927Z warnings.warn( 2022-11-23T03:12:18.0188060Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0189451Z warnings.warn( 2022-11-23T03:12:18.0191552Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0192993Z warnings.warn( 2022-11-23T03:12:18.0195028Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0196401Z warnings.warn( 2022-11-23T03:12:18.0198499Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0199835Z warnings.warn( 2022-11-23T03:12:18.0200254Z dist init r=1, world=4 2022-11-23T03:12:18.0200663Z dist init r=3, world=4 2022-11-23T03:12:18.0201080Z dist init r=0, world=4 2022-11-23T03:12:18.0201507Z dist init r=2, world=4 2022-11-23T03:12:18.0201880Z ok (5.220s) 2022-11-23T03:12:18.0202461Z test_delayed_optim_step_offload_false_no_shard (__main__.TestParityWithDDP) 2022-11-23T03:12:18.0203423Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10793 2022-11-23T03:12:18.0204376Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10794 2022-11-23T03:12:18.0205176Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10795 2022-11-23T03:12:18.0206013Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10796 2022-11-23T03:12:18.0207164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0207929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0208981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0209806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0210938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0211703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0212818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0213670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0214682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0215461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0216526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0217410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0218454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.0219245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0220353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0221174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0221944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.0222848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.0223756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.0225072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.0226191Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0227376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0228642Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0229899Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.0230856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.0231730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.0232617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.0233484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.0234288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0235175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0236046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0236833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0239151Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0240529Z warnings.warn( 2022-11-23T03:12:18.0242838Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.0244252Z warnings.warn( 2022-11-23T03:12:18.0246340Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0247674Z warnings.warn( 2022-11-23T03:12:18.0249694Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0251210Z warnings.warn( 2022-11-23T03:12:18.0251869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0252698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0253532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0254386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0255232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0256050Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0256882Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0257710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0258556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0259428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0260280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0261081Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0261917Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0262707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0263539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0264795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0265634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0266459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0267321Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0268175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0268986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0269813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0270611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0271542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0272367Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0273213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0274038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0274843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0275642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0276484Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0277355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0278158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0279111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0279989Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0280800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0281612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0283474Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0285744Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0288007Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0290221Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0291579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0292406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0293284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0294194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0295005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0295878Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0296727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0297546Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0298370Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0299231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0300058Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0300946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0301793Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0302651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0303458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0304653Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0305506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0306341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0307184Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0308092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0308952Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0309790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0310727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0311515Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0312349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0313153Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0313981Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0314803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0315655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0316531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0317306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0318142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0318959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0319771Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0320564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0321416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0322293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0323125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0323992Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0324804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0325657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0326533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0327381Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0328231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0329030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0329816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0330773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0331627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0333413Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0335777Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0338061Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0340339Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0341614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0342462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0343251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0344549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0345447Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0346245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0347029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0347818Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0348399Z dist init r=1, world=4 2022-11-23T03:12:18.0348812Z dist init r=0, world=4 2022-11-23T03:12:18.0349251Z dist init r=3, world=4 2022-11-23T03:12:18.0349675Z dist init r=2, world=4 2022-11-23T03:12:18.0350082Z ok (20.755s) 2022-11-23T03:12:18.0350669Z test_delayed_optim_step_offload_false_none (__main__.TestParityWithDDP) 2022-11-23T03:12:18.0351639Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11094 2022-11-23T03:12:18.0352606Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11095 2022-11-23T03:12:18.0353420Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11096 2022-11-23T03:12:18.0354196Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11097 2022-11-23T03:12:18.0355254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0356048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0357089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0357878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0358945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0359719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0360898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0361747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0362807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0363546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0364521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0365432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0366478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.0367239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0368402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0369239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0369974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.0370866Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.0371670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.0372525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.0373641Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0374844Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0376084Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0377301Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.0378172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.0378996Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.0379809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.0380635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.0381416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0382247Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0383130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0384380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0386667Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0388062Z warnings.warn( 2022-11-23T03:12:18.0390249Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.0391672Z warnings.warn( 2022-11-23T03:12:18.0393793Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0395200Z warnings.warn( 2022-11-23T03:12:18.0397280Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0398777Z warnings.warn( 2022-11-23T03:12:18.0399412Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0400262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0401101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0401946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0402798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0403680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0404472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0405348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0406200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0406994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0407817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0408697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0409538Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0410402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0411306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0412128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0412961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0413800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0414644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0415455Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0416274Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0417125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0417970Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0418871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0419687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0420521Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0421307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0422149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0422934Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0423744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0424990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0425818Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0426741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0427582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0428433Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0429243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0431112Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0433399Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0435618Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0437854Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0439159Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0439987Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0440812Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0441647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0442442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0443272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0444151Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0445006Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0445854Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0446688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0447529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0448455Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0449248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0450120Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0450966Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0451818Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0452646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0453544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0454374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0455283Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0456069Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0456868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0457697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0458499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0459331Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0460202Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0461031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0461828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0462664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0463502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0464748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0465557Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0466452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0467312Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0468157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0469015Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0469804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0470621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0471448Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0472284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0473172Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0474016Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0474867Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0475705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0476532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0477306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0478223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0479075Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0516375Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0517650Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0519038Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0520249Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0520965Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0521456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0521916Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0522401Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0522870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0523337Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0523791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0524257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0524608Z dist init r=1, world=4 2022-11-23T03:12:18.0524839Z dist init r=3, world=4 2022-11-23T03:12:18.0525081Z dist init r=2, world=4 2022-11-23T03:12:18.0525323Z dist init r=0, world=4 2022-11-23T03:12:18.0525538Z ok (29.875s) 2022-11-23T03:12:18.0525882Z test_delayed_optim_step_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T03:12:18.0526427Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11395 2022-11-23T03:12:18.0526964Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11396 2022-11-23T03:12:18.0527392Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11397 2022-11-23T03:12:18.0527884Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11398 2022-11-23T03:12:18.0528505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0528941Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0529513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0529978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0530553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0531130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0531714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0532171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0532743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0533165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0533729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0534188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0534739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.0535238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0535807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0536275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0536712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.0537219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.0537722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.0538199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.0538862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0539561Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0540250Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0540906Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.0541426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.0541902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.0542378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.0542827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.0543299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0543780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0545024Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0545895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0547356Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0548142Z warnings.warn( 2022-11-23T03:12:18.0549385Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.0550173Z warnings.warn( 2022-11-23T03:12:18.0551302Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0552056Z warnings.warn( 2022-11-23T03:12:18.0553195Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0554033Z warnings.warn( 2022-11-23T03:12:18.0554411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0554880Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0555366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0555849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0556307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0556798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0557277Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0557759Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0558219Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0558777Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0559261Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0559712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0560192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0560663Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0561139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0561595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0562074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0562549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0563024Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0563475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0563947Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0564422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0564884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0565417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0565900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0566370Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0566828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0567306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0567783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0568284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0568759Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0569289Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0569765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0570216Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0570688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0571157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0572167Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0573394Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0574628Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0575863Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0576591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0577082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0577543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0578027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0578508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0578983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0579438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0579912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0580397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0580850Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0581382Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0581855Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0582320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0582775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0583239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0583703Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0584469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0584939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0585405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0585954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0586406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0586871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0587335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0587803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0588252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0588712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0589179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0589634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0590101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0590565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0591027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0591478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0591945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0592409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0592858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0593320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0593789Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0594251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0594701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0595164Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0595623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0596072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0596541Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0597006Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0597469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0597980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0598453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0598919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0599913Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0601127Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0602422Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0603642Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.0604363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0604840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0605289Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0605767Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0606235Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0606703Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0607153Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0607615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0607966Z dist init r=3, world=4 2022-11-23T03:12:18.0608198Z dist init r=1, world=4 2022-11-23T03:12:18.0608439Z dist init r=0, world=4 2022-11-23T03:12:18.0608681Z dist init r=2, world=4 2022-11-23T03:12:18.0608895Z ok (29.776s) 2022-11-23T03:12:18.0609231Z test_delayed_optim_step_offload_true_no_shard (__main__.TestParityWithDDP) 2022-11-23T03:12:18.0610374Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82490 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T03:12:18.0611215Z test_delayed_optim_step_offload_true_none (__main__.TestParityWithDDP) 2022-11-23T03:12:18.0611745Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11696 2022-11-23T03:12:18.0612253Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11697 2022-11-23T03:12:18.0612699Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11698 2022-11-23T03:12:18.0613144Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11699 2022-11-23T03:12:18.0613791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0614250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0614825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0615291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0615848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0616283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0616848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0617288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0617856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.0618346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0618908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0619349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0619912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.0620346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.0620908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.0621346Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.0621793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.0622294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.0622768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.0623261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.0624217Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0625534Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0626537Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.0627217Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.0627744Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.0628221Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.0628671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.0629125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.0629599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0630062Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0630538Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0631010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0632359Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0633147Z warnings.warn( 2022-11-23T03:12:18.0634282Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.0635047Z warnings.warn( 2022-11-23T03:12:18.0636429Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.0637245Z warnings.warn( 2022-11-23T03:12:18.0638433Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.0639185Z warnings.warn( 2022-11-23T03:12:18.0639436Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0639721Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0640089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0640442Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0640808Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0641174Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0641535Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0641895Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0642109Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0642212Z self.run() 2022-11-23T03:12:18.0642413Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0642560Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0642752Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0642898Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0643110Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0643213Z self.run() 2022-11-23T03:12:18.0643560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0643693Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0643894Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0644039Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0644388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0644511Z getattr(self, test_name)() 2022-11-23T03:12:18.0644848Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0645045Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0645418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0645515Z fn() 2022-11-23T03:12:18.0645875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0645997Z getattr(self, test_name)() 2022-11-23T03:12:18.0646346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0646466Z test(self, **param_kwargs) 2022-11-23T03:12:18.0646823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0646918Z fn() 2022-11-23T03:12:18.0647272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0647442Z return func(*args, **kwargs) 2022-11-23T03:12:18.0647574Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0647923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0648044Z test(self, **param_kwargs) 2022-11-23T03:12:18.0648293Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0648405Z self.run_subtests( 2022-11-23T03:12:18.0648534Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0648744Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0648885Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0649240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0649346Z return func(*args, **kwargs)
2022-11-23T03:12:18.0649706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0649868Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0650075Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0650216Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0650414Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0650563Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0650810Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0650904Z self.run_subtests( 2022-11-23T03:12:18.0651267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0651419Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0651620Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0651770Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0652145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0652263Z output = model(*input) 2022-11-23T03:12:18.0652474Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0652560Z self.run() 2022-11-23T03:12:18.0652887Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0653026Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0653378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0653540Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0653752Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0653903Z self.run() 2022-11-23T03:12:18.0654095Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0654243Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0654446Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0654587Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0654964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0655141Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0655503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0655651Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0655971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0656156Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0656497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0656629Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0656996Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0657115Z _lazy_init(state, module) 2022-11-23T03:12:18.0657490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0657683Z output = model(*input) 2022-11-23T03:12:18.0658033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0658201Z getattr(self, test_name)() 2022-11-23T03:12:18.0658747Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0658867Z getattr(self, test_name)() 2022-11-23T03:12:18.0659265Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0659455Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0659822Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0659996Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0660392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0660525Z fn() 2022-11-23T03:12:18.0660941Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0661099Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0661552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0661697Z fn() 2022-11-23T03:12:18.0662080Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0662240Z return func(*args, **kwargs) 2022-11-23T03:12:18.0662648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0662807Z test(self, **param_kwargs) 2022-11-23T03:12:18.0663213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0663318Z _lazy_init(state, module) 2022-11-23T03:12:18.0663752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0664159Z test(self, **param_kwargs) 
2022-11-23T03:12:18.0664749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0664906Z p_assert( 2022-11-23T03:12:18.0665307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0665468Z return func(*args, **kwargs) 2022-11-23T03:12:18.0665849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0665957Z traceback.print_stack() 2022-11-23T03:12:18.0666348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0666537Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0666977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0667140Z return func(*args, **kwargs) 2022-11-23T03:12:18.0667430Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0667656Z self.run_subtests( 2022-11-23T03:12:18.0668077Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0668195Z return func(*args, **kwargs) 2022-11-23T03:12:18.0668494Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0668680Z self.run_subtests( 2022-11-23T03:12:18.0669074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0669314Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0669738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0669873Z p_assert( 
2022-11-23T03:12:18.0670263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0670418Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0670832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0671022Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0671396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0671559Z traceback.print_stack() 2022-11-23T03:12:18.0671998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0672190Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0672608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0672712Z output = model(*input) 2022-11-23T03:12:18.0673144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0673302Z output = model(*input) 2022-11-23T03:12:18.0673697Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0673873Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0674231Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0674441Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0674863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0675023Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0675447Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0675660Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0676122Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0676395Z _lazy_init(state, module) 2022-11-23T03:12:18.0676817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0676977Z _lazy_init(state, module) 2022-11-23T03:12:18.0677422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0677553Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0677949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0678123Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0678534Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0678756Z return func(*args, **kwargs) 2022-11-23T03:12:18.0679136Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0679306Z return func(*args, **kwargs) 2022-11-23T03:12:18.0679722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0679808Z p_assert( 2022-11-23T03:12:18.0680269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0680404Z p_assert( 2022-11-23T03:12:18.0680784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0680948Z traceback.print_stack() 
2022-11-23T03:12:18.0681322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0681500Z traceback.print_stack() 2022-11-23T03:12:18.0681801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0682022Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0682296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0682599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0682765Z File "", line 1, in 2022-11-23T03:12:18.0683047Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0683226Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0683474Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0683660Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0683859Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0684006Z self.run() 2022-11-23T03:12:18.0684247Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0684466Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0684856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0685024Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0685438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0685546Z getattr(self, test_name)() 2022-11-23T03:12:18.0685948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0686082Z fn() 2022-11-23T03:12:18.0686485Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0686649Z test(self, **param_kwargs) 2022-11-23T03:12:18.0687135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0687342Z return func(*args, **kwargs) 2022-11-23T03:12:18.0687640Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0687736Z self.run_subtests( 2022-11-23T03:12:18.0688128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0688326Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0688728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0688915Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0689328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0689585Z output = model(*input) 2022-11-23T03:12:18.0689958Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0690083Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0690498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0690712Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0691113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0691268Z _lazy_init(state, module) 2022-11-23T03:12:18.0691712Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.0691901Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0692343Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0692463Z return func(*args, **kwargs) 2022-11-23T03:12:18.0692635Z File "", line 1, in 2022-11-23T03:12:18.0693052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0693189Z p_assert( 2022-11-23T03:12:18.0693562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0693724Z traceback.print_stack() 2022-11-23T03:12:18.0693983Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0694161Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0694347Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0694567Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0694819Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0694967Z self.run() 2022-11-23T03:12:18.0695211Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0695391Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0695781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0695950Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0696300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0696457Z getattr(self, test_name)() 2022-11-23T03:12:18.0696917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0697053Z fn() 2022-11-23T03:12:18.0697455Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0697619Z test(self, **param_kwargs) 2022-11-23T03:12:18.0698069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0698183Z return func(*args, **kwargs) 2022-11-23T03:12:18.0698473Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0698622Z self.run_subtests( 2022-11-23T03:12:18.0699014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0699247Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0699654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0699845Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0700271Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0700428Z output = model(*input) 2022-11-23T03:12:18.0700797Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0700979Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0701424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0701634Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0702077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0702246Z _lazy_init(state, module) 2022-11-23T03:12:18.0702639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.0702815Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0703148Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0703309Z return func(*args, **kwargs) 2022-11-23T03:12:18.0703724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0704182Z p_assert( 2022-11-23T03:12:18.0704897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0705252Z traceback.print_stack() 2022-11-23T03:12:18.0705529Z File "", line 1, in 2022-11-23T03:12:18.0705881Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0706181Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0706589Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0706865Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0707127Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0707308Z self.run() 2022-11-23T03:12:18.0707487Z File "", line 1, in 2022-11-23T03:12:18.0707769Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0707907Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0708295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0708464Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0708715Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0708891Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0709293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0709464Z getattr(self, test_name)() 
2022-11-23T03:12:18.0709649Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0709936Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0710389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0710527Z fn() 2022-11-23T03:12:18.0710840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0710982Z self.run() 2022-11-23T03:12:18.0711390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0711595Z test(self, **param_kwargs) 2022-11-23T03:12:18.0711783Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0711966Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0712364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0712639Z return func(*args, **kwargs) 2022-11-23T03:12:18.0713030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0713200Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0713499Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0713650Z self.run_subtests( 2022-11-23T03:12:18.0713996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0845475Z getattr(self, test_name)() 2022-11-23T03:12:18.0846054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0846219Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0846585Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0846697Z fn() 2022-11-23T03:12:18.0847058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0847206Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0847573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0847686Z test(self, **param_kwargs) 2022-11-23T03:12:18.0848058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0848165Z output = model(*input) 2022-11-23T03:12:18.0848516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0848630Z return func(*args, **kwargs) 2022-11-23T03:12:18.0848940Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0849081Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0849323Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0849424Z self.run_subtests( 2022-11-23T03:12:18.0849795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0849967Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0850307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0850456Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0850806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T03:12:18.0850923Z _lazy_init(state, module) 2022-11-23T03:12:18.0851277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0851672Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0852049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0852181Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0852554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0852661Z output = model(*input) 2022-11-23T03:12:18.0852980Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0853100Z return func(*args, **kwargs) 2022-11-23T03:12:18.0853411Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0853541Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0854010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0854102Z p_assert( 2022-11-23T03:12:18.0854475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0854641Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0854958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0855080Z traceback.print_stack() 2022-11-23T03:12:18.0855434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0855551Z _lazy_init(state, module) 2022-11-23T03:12:18.0855887Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0856021Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0856356Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0856475Z return func(*args, **kwargs) 2022-11-23T03:12:18.0856834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0856924Z p_assert( 2022-11-23T03:12:18.0857254Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0857367Z traceback.print_stack() 2022-11-23T03:12:18.0857601Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0857826Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0858046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0858270Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0858386Z File "", line 1, in 2022-11-23T03:12:18.0858503Z File "", line 1, in 2022-11-23T03:12:18.0858639Z File "", line 1, in 2022-11-23T03:12:18.0858813Z File "", line 1, in 2022-11-23T03:12:18.0859030Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0859159Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0859367Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0859494Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0859686Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0859819Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0860011Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0860140Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0860394Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0860548Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0860738Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0860868Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0861063Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0861198Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0861399Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0861536Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0861744Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0861836Z self.run() 2022-11-23T03:12:18.0862032Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0862167Z self.run() 2022-11-23T03:12:18.0862383Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0862484Z self.run() 2022-11-23T03:12:18.0862678Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0862775Z self.run() 2022-11-23T03:12:18.0862963Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0863105Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0863290Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0863421Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0863607Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0863742Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0864151Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0864305Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0864655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0864785Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0865107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0865226Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0865557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0865677Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0866002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0866120Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0866479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0866595Z getattr(self, test_name)() 2022-11-23T03:12:18.0866945Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0867064Z getattr(self, test_name)() 2022-11-23T03:12:18.0867414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0867501Z fn() 2022-11-23T03:12:18.0867852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0867962Z getattr(self, test_name)() 2022-11-23T03:12:18.0868357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0868469Z getattr(self, test_name)() 2022-11-23T03:12:18.0868811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0868900Z fn() 2022-11-23T03:12:18.0869327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0869419Z fn() 2022-11-23T03:12:18.0869785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0869895Z test(self, **param_kwargs) 2022-11-23T03:12:18.0870241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0870318Z fn() 2022-11-23T03:12:18.0870669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0870786Z test(self, **param_kwargs) 2022-11-23T03:12:18.0871135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0871252Z test(self, **param_kwargs) 2022-11-23T03:12:18.0871672Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0871793Z test(self, **param_kwargs) 2022-11-23T03:12:18.0872131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0872238Z return func(*args, **kwargs) 2022-11-23T03:12:18.0872590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0872701Z return func(*args, **kwargs) 2022-11-23T03:12:18.0873056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0873166Z return func(*args, **kwargs) 2022-11-23T03:12:18.0873508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0873618Z return func(*args, **kwargs) 2022-11-23T03:12:18.0873870Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0873966Z self.run_subtests( 2022-11-23T03:12:18.0874200Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0874299Z self.run_subtests( 2022-11-23T03:12:18.0874532Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0874632Z self.run_subtests( 2022-11-23T03:12:18.0874868Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0874974Z self.run_subtests( 2022-11-23T03:12:18.0875316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0875460Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0875806Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0875962Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0876293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0876448Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0876785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0876929Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0877277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0877412Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0877875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0878067Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0878423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0878560Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0878921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0879039Z output = model(*input) 2022-11-23T03:12:18.0879398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0879496Z output = model(*input) 2022-11-23T03:12:18.0879842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0879979Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0880398Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0880511Z output = model(*input) 2022-11-23T03:12:18.0880820Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0880948Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0881261Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0881382Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0881704Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0881841Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0882202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0882309Z output = model(*input) 2022-11-23T03:12:18.0882680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0882845Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0883209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0883377Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0883726Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0883887Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0884204Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0884331Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0884686Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0884810Z _lazy_init(state, module) 2022-11-23T03:12:18.0885161Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0885275Z _lazy_init(state, module) 2022-11-23T03:12:18.0885608Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0885740Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0886092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0886200Z _lazy_init(state, module) 2022-11-23T03:12:18.0886561Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0886722Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0887151Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0887288Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0887622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0887752Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0888085Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0888197Z return func(*args, **kwargs) 2022-11-23T03:12:18.0888558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0888664Z _lazy_init(state, module) 2022-11-23T03:12:18.0888991Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0889102Z 
return func(*args, **kwargs) 2022-11-23T03:12:18.0889484Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0889597Z return func(*args, **kwargs) 2022-11-23T03:12:18.0889964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0890054Z p_assert( 2022-11-23T03:12:18.0890398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0890528Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0890890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0890986Z p_assert( 2022-11-23T03:12:18.0891303Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0891416Z return func(*args, **kwargs) 2022-11-23T03:12:18.0891795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0891885Z p_assert( 2022-11-23T03:12:18.0892210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0892324Z traceback.print_stack() 2022-11-23T03:12:18.0892655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0892768Z traceback.print_stack() 2022-11-23T03:12:18.0893123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0893225Z p_assert( 2022-11-23T03:12:18.0893545Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0893657Z traceback.print_stack() 2022-11-23T03:12:18.0893989Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0894104Z traceback.print_stack() 2022-11-23T03:12:18.0894348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0894565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0894782Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0895004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0895122Z File "", line 1, in 2022-11-23T03:12:18.0895331Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0895460Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0895649Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0895786Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0895986Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0896139Z self.run() 2022-11-23T03:12:18.0896342Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0896476Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0896818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0896951Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0897300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0897423Z getattr(self, test_name)() 2022-11-23T03:12:18.0897763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0897860Z fn() 2022-11-23T03:12:18.0898212Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0898471Z test(self, **param_kwargs) 2022-11-23T03:12:18.0898831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0898944Z return func(*args, **kwargs) 2022-11-23T03:12:18.0899192Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0899301Z self.run_subtests( 2022-11-23T03:12:18.0899633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0899784Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0900144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0900295Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0900656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0900778Z output = model(*input) 2022-11-23T03:12:18.0901096Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0901233Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0901591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0901757Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0902112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0902233Z _lazy_init(state, module) 2022-11-23T03:12:18.0902576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.0902709Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0903043Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0903163Z return func(*args, **kwargs) 2022-11-23T03:12:18.0903527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0903619Z p_assert( 2022-11-23T03:12:18.0904297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0904512Z traceback.print_stack() 2022-11-23T03:12:18.0904724Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0905060Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0905283Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0905591Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0905835Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0906325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0906502Z self.run() 2022-11-23T03:12:18.0906835Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0907055Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0907496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0907620Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0907964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0908086Z getattr(self, test_name)() 2022-11-23T03:12:18.0908435Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0908530Z fn() 2022-11-23T03:12:18.0908882Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0909080Z test(self, **param_kwargs) 2022-11-23T03:12:18.0909429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0909549Z return func(*args, **kwargs) 2022-11-23T03:12:18.0909781Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0909882Z self.run_subtests( 2022-11-23T03:12:18.0910229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0910379Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0910799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0910941Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0911317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0911432Z output = model(*input) 2022-11-23T03:12:18.0911737Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0911868Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0912241Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0912405Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0912765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0912876Z _lazy_init(state, module) 2022-11-23T03:12:18.0913222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.0913354Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0913679Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0913800Z return func(*args, **kwargs) 2022-11-23T03:12:18.0914169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0914267Z p_assert( 2022-11-23T03:12:18.0914596Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0914722Z traceback.print_stack() 2022-11-23T03:12:18.0914845Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0915059Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0915182Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0915372Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0915519Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0915778Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0915879Z self.run() 2022-11-23T03:12:18.0916082Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0916230Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0916553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0916675Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0916798Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0917148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0917270Z getattr(self, test_name)() 2022-11-23T03:12:18.0917468Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0917597Z exitcode = _main(fd, parent_sentinel)
2022-11-23T03:12:18.0918007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0918087Z fn() 2022-11-23T03:12:18.0918280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0918428Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0918784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0918903Z test(self, **param_kwargs) 2022-11-23T03:12:18.0919101Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0919201Z self.run() 2022-11-23T03:12:18.0919550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0919656Z return func(*args, **kwargs) 2022-11-23T03:12:18.0919856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0919995Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0920246Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0920349Z self.run_subtests( 2022-11-23T03:12:18.0920687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0920812Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0921145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0921297Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0921656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0921771Z getattr(self, test_name)() 2022-11-23T03:12:18.0922125Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0922274Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0922625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0922724Z fn() 2022-11-23T03:12:18.0923081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0923191Z output = model(*input) 2022-11-23T03:12:18.0923550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0923671Z test(self, **param_kwargs) 2022-11-23T03:12:18.0923984Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0924121Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0924470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0924644Z return func(*args, **kwargs) 2022-11-23T03:12:18.0925007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0925174Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0925411Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0925519Z self.run_subtests( 2022-11-23T03:12:18.0925874Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0925984Z _lazy_init(state, module) 2022-11-23T03:12:18.0926331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0926482Z test_fn(*test_args, 
**test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0926869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0927011Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0927365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0927510Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0927845Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0927961Z return func(*args, **kwargs) 2022-11-23T03:12:18.0928329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0928438Z output = model(*input) 2022-11-23T03:12:18.0928814Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0928904Z p_assert( 2022-11-23T03:12:18.0929222Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0929358Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0929701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0929815Z traceback.print_stack() 2022-11-23T03:12:18.0930187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0930353Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0930718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0930822Z _lazy_init(state, module) 2022-11-23T03:12:18.0931161Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T03:12:18.0931306Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0931636Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0931761Z return func(*args, **kwargs) 2022-11-23T03:12:18.0932128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0932231Z p_assert( 2022-11-23T03:12:18.0932544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0932660Z traceback.print_stack() 2022-11-23T03:12:18.0932899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0933124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0933362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0933593Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.0933768Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0933894Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0934086Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0934228Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0934347Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0934545Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0934686Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0934890Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0935021Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0935339Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0935430Z self.run() 2022-11-23T03:12:18.0935689Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0935822Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0936012Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0936150Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0936344Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0936477Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0936658Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0936806Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0937014Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0937104Z self.run() 2022-11-23T03:12:18.0937443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0937579Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0937782Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in
_bootstrap 2022-11-23T03:12:18.0937880Z self.run() 2022-11-23T03:12:18.0937989Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.0938181Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0938312Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0938665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0938778Z getattr(self, test_name)() 2022-11-23T03:12:18.0938978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0939111Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0939299Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0939430Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0939766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0939898Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0940255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0940344Z fn() 2022-11-23T03:12:18.0940674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0940795Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0940979Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0941117Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0941468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0941588Z getattr(self, test_name)() 2022-11-23T03:12:18.0941937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0942103Z test(self, **param_kwargs)
2022-11-23T03:12:18.0942467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0942577Z getattr(self, test_name)() 2022-11-23T03:12:18.0942771Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0942872Z self.run() 2022-11-23T03:12:18.0943220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0943308Z fn() 2022-11-23T03:12:18.0943644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0943735Z fn() 2022-11-23T03:12:18.0944538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0944879Z test(self, **param_kwargs) 2022-11-23T03:12:18.0945562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0945776Z return func(*args, **kwargs) 2022-11-23T03:12:18.0946124Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0946266Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0946516Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0946627Z self.run_subtests( 2022-11-23T03:12:18.0946990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0947094Z test(self, **param_kwargs) 2022-11-23T03:12:18.0947448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0947565Z return func(*args, **kwargs) 2022-11-23T03:12:18.0947903Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0948025Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0948374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0948527Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0948881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0948986Z return func(*args, **kwargs) 2022-11-23T03:12:18.0949235Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0949345Z self.run_subtests( 2022-11-23T03:12:18.0949700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0949826Z getattr(self, test_name)() 2022-11-23T03:12:18.0950186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0950335Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0950573Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0950668Z self.run_subtests( 2022-11-23T03:12:18.0951042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0951154Z output = model(*input) 2022-11-23T03:12:18.0951503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0951656Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0952010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0952103Z fn() 
2022-11-23T03:12:18.0952601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0952757Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0953076Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0953208Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0953570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0953721Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0954071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0954184Z test(self, **param_kwargs) 2022-11-23T03:12:18.0954553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0954708Z output = model(*input) 2022-11-23T03:12:18.0955060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0955209Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0955572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0955741Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0956092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0956205Z return func(*args, **kwargs) 2022-11-23T03:12:18.0956527Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0956660Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0957025Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0957143Z output = model(*input) 2022-11-23T03:12:18.0957500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0957620Z _lazy_init(state, module) 2022-11-23T03:12:18.0957859Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0957972Z self.run_subtests( 2022-11-23T03:12:18.0958338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0958510Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0958817Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0958949Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0959301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0959444Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0959784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0959943Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0960296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0960409Z _lazy_init(state, module) 2022-11-23T03:12:18.0960754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0960969Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0961332Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, 
in forward 2022-11-23T03:12:18.0961558Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0961898Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0962018Z return func(*args, **kwargs) 2022-11-23T03:12:18.0962374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0962495Z _lazy_init(state, module) 2022-11-23T03:12:18.0962826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0962970Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0963337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0963446Z output = model(*input) 2022-11-23T03:12:18.0963839Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0963952Z return func(*args, **kwargs) 2022-11-23T03:12:18.0964328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0964421Z p_assert( 2022-11-23T03:12:18.0964754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0964892Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0965205Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0965345Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0965746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0965838Z p_assert( 2022-11-23T03:12:18.0966220Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0966390Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0966707Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0966829Z traceback.print_stack() 2022-11-23T03:12:18.0967155Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0967272Z return func(*args, **kwargs) 2022-11-23T03:12:18.0967585Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0967706Z traceback.print_stack() 2022-11-23T03:12:18.0968060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0968277Z _lazy_init(state, module) 2022-11-23T03:12:18.0968781Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0968875Z p_assert( 2022-11-23T03:12:18.0969221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.0969352Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0969684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0969798Z traceback.print_stack() 2022-11-23T03:12:18.0970133Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0970245Z return func(*args, **kwargs) 2022-11-23T03:12:18.0970604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0970699Z p_assert( 2022-11-23T03:12:18.0971019Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0971190Z traceback.print_stack() 2022-11-23T03:12:18.0971425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0971659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0971880Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0972102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.0972215Z File "", line 1, in 2022-11-23T03:12:18.0972417Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0972557Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0972749Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0972894Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0973150Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0973256Z self.run() 2022-11-23T03:12:18.0973437Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0973573Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0973917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0974045Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0974409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0974520Z getattr(self, test_name)() 2022-11-23T03:12:18.0974873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0974962Z fn() 2022-11-23T03:12:18.0975310Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0975437Z test(self, **param_kwargs) 2022-11-23T03:12:18.0975791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0975912Z return func(*args, **kwargs) 2022-11-23T03:12:18.0976155Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0976263Z self.run_subtests( 2022-11-23T03:12:18.0976602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0976762Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0977108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0977246Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0977622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0977731Z output = model(*input) 2022-11-23T03:12:18.0978051Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0978180Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0978550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0978723Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0979067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0979177Z _lazy_init(state, module) 2022-11-23T03:12:18.0979516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.0979662Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0980035Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0980166Z return func(*args, **kwargs) 2022-11-23T03:12:18.0980539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0980637Z p_assert( 2022-11-23T03:12:18.0980953Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0981076Z traceback.print_stack() 2022-11-23T03:12:18.0981193Z File "", line 1, in 2022-11-23T03:12:18.0981395Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0981537Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0981728Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0981918Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0982120Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0982214Z self.run() 2022-11-23T03:12:18.0982416Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0982550Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0982882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0983010Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0983361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0983472Z getattr(self, test_name)() 2022-11-23T03:12:18.0983811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0984205Z fn() 2022-11-23T03:12:18.0984920Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0985150Z test(self, **param_kwargs) 2022-11-23T03:12:18.0985841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0986057Z return func(*args, **kwargs) 2022-11-23T03:12:18.0986345Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0986449Z self.run_subtests( 2022-11-23T03:12:18.0986788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0986946Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0987298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0987447Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0987829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0987940Z output = model(*input) 2022-11-23T03:12:18.0988263Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0988394Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0988756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.0988924Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.0989286Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.0989396Z _lazy_init(state, module) 2022-11-23T03:12:18.0989735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.0989881Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.0990289Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.0990419Z return func(*args, **kwargs) 2022-11-23T03:12:18.0990785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.0990883Z p_assert( 2022-11-23T03:12:18.0991218Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.0991342Z traceback.print_stack() 2022-11-23T03:12:18.0991463Z File "", line 1, in 2022-11-23T03:12:18.0991665Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0991802Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0991994Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0992195Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0992411Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0992509Z self.run() 2022-11-23T03:12:18.0992708Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0992846Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0993180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0993308Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0993654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0993771Z getattr(self, test_name)() 2022-11-23T03:12:18.0994128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.0994215Z fn() 2022-11-23T03:12:18.0994582Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.0994694Z test(self, **param_kwargs) 2022-11-23T03:12:18.0994817Z File "", line 1, in 2022-11-23T03:12:18.0995162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.0995267Z return func(*args, **kwargs) 2022-11-23T03:12:18.0995475Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.0995606Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.0995850Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.0995952Z self.run_subtests( 2022-11-23T03:12:18.0996151Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.0996291Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.0996649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.0996792Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.0996995Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.0997096Z self.run() 2022-11-23T03:12:18.0997451Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.0997604Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.0997801Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.0997944Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.0998312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.0998413Z output = model(*input) 2022-11-23T03:12:18.0998797Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.0998924Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.0999250Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.0999381Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.0999737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.0999856Z getattr(self, test_name)() 2022-11-23T03:12:18.1000227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1000384Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1000732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1000829Z fn() 2022-11-23T03:12:18.1001237Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1001358Z _lazy_init(state, module) 2022-11-23T03:12:18.1001712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1001833Z test(self, **param_kwargs) 2022-11-23T03:12:18.1002176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1002301Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1002655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1002773Z return func(*args, **kwargs) 2022-11-23T03:12:18.1003107Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1003219Z return func(*args, **kwargs) 
2022-11-23T03:12:18.1003472Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1003580Z self.run_subtests( 2022-11-23T03:12:18.1003958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1004044Z p_assert( 2022-11-23T03:12:18.1004387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1004546Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1004869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1004993Z traceback.print_stack() 2022-11-23T03:12:18.1005347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1005495Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1005867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1005968Z output = model(*input) 2022-11-23T03:12:18.1006290Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1006420Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1006791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1006954Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1007315Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1007425Z _lazy_init(state, module) 2022-11-23T03:12:18.1007770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in 
_lazy_init 2022-11-23T03:12:18.1007897Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1008270Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1008398Z return func(*args, **kwargs) 2022-11-23T03:12:18.1008769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1008870Z p_assert( 2022-11-23T03:12:18.1009194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1009319Z traceback.print_stack() 2022-11-23T03:12:18.1009539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1009765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1009996Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1010269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1010396Z File "", line 1, in 2022-11-23T03:12:18.1010601Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1010740Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1010917Z File "", line 1, in 2022-11-23T03:12:18.1011104Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1011253Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1011459Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1011560Z self.run() 2022-11-23T03:12:18.1011759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1011898Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1012098Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1012246Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1012428Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1012567Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1012693Z File "", line 1, in 2022-11-23T03:12:18.1013032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1013164Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1013364Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1013464Z self.run() 2022-11-23T03:12:18.1013807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1013924Z getattr(self, test_name)() 2022-11-23T03:12:18.1014131Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1014267Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1014473Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1014617Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1014970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1015058Z fn() 2022-11-23T03:12:18.1015239Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1015386Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1015715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1015845Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1015964Z File "", line 1, in 2022-11-23T03:12:18.1016331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1016520Z test(self, **param_kwargs) 2022-11-23T03:12:18.1016734Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1016819Z self.run() 2022-11-23T03:12:18.1017178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1017291Z getattr(self, test_name)() 2022-11-23T03:12:18.1017483Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1017627Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1017976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1018096Z return func(*args, **kwargs) 2022-11-23T03:12:18.1018286Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1018420Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1018840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in 
wrapper 2022-11-23T03:12:18.1018930Z fn() 2022-11-23T03:12:18.1019265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1019396Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1019641Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1019753Z self.run_subtests( 2022-11-23T03:12:18.1019936Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1020077Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1020442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1020554Z test(self, **param_kwargs) 2022-11-23T03:12:18.1020910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1021030Z getattr(self, test_name)() 2022-11-23T03:12:18.1021377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1021534Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1021874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1021971Z fn() 2022-11-23T03:12:18.1022173Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1022275Z self.run() 2022-11-23T03:12:18.1022620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1022741Z return func(*args, **kwargs) 2022-11-23T03:12:18.1023096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1023255Z fsdp_loss = self._train_for_several_steps( 
2022-11-23T03:12:18.1023596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1023713Z test(self, **param_kwargs) 2022-11-23T03:12:18.1024239Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1024511Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1024965Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1025159Z self.run_subtests( 2022-11-23T03:12:18.1025894Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1026106Z output = model(*input) 2022-11-23T03:12:18.1026498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1026702Z return func(*args, **kwargs) 2022-11-23T03:12:18.1027049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1027170Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1027521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1027674Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1027997Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1028136Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1028371Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1028477Z self.run_subtests( 2022-11-23T03:12:18.1028848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1029092Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1029449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1029569Z getattr(self, test_name)() 2022-11-23T03:12:18.1029925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1030074Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1030417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1030506Z fn() 2022-11-23T03:12:18.1030851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1031005Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1031378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1031490Z _lazy_init(state, module) 2022-11-23T03:12:18.1031844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1031987Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1032320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1032454Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1032828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1032938Z output = model(*input) 2022-11-23T03:12:18.1033288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1033458Z test(self, **param_kwargs) 2022-11-23T03:12:18.1033829Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1033940Z output = model(*input) 2022-11-23T03:12:18.1034259Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1034382Z return func(*args, **kwargs) 2022-11-23T03:12:18.1034705Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1034843Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1035197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1035314Z return func(*args, **kwargs) 2022-11-23T03:12:18.1035628Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1035768Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1036181Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1036283Z p_assert( 2022-11-23T03:12:18.1036656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1036826Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1037155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1037276Z traceback.print_stack() 2022-11-23T03:12:18.1037639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1037813Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1038045Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1038202Z 
self.run_subtests( 2022-11-23T03:12:18.1038568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1038683Z _lazy_init(state, module) 2022-11-23T03:12:18.1039048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1039159Z _lazy_init(state, module) 2022-11-23T03:12:18.1039506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1039659Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1039991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1040133Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1040476Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1040613Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1041011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1041161Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1041492Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1041609Z return func(*args, **kwargs) 2022-11-23T03:12:18.1041928Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1042050Z return func(*args, **kwargs) 2022-11-23T03:12:18.1042421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1042536Z output = model(*input) 2022-11-23T03:12:18.1042918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T03:12:18.1043019Z p_assert( 2022-11-23T03:12:18.1043396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1043488Z p_assert( 2022-11-23T03:12:18.1043792Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1043931Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1044261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1044386Z traceback.print_stack() 2022-11-23T03:12:18.1044711Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1044833Z traceback.print_stack() 2022-11-23T03:12:18.1045198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1045465Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1045825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1045938Z _lazy_init(state, module) 2022-11-23T03:12:18.1046287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1046429Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1046761Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1046884Z return func(*args, **kwargs) 2022-11-23T03:12:18.1047257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1047352Z p_assert( 2022-11-23T03:12:18.1047733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.1047856Z traceback.print_stack() 2022-11-23T03:12:18.1048088Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1048322Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1048553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1048784Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1048910Z File "", line 1, in 2022-11-23T03:12:18.1049120Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1049243Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1049442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1049596Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1049801Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1049905Z self.run() 2022-11-23T03:12:18.1050104Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1050251Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1050574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1050705Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1051065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1051191Z getattr(self, test_name)() 2022-11-23T03:12:18.1051552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1051646Z fn() 2022-11-23T03:12:18.1052014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.1052119Z test(self, **param_kwargs) 2022-11-23T03:12:18.1052466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1052587Z return func(*args, **kwargs) 2022-11-23T03:12:18.1052828Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1052940Z self.run_subtests( 2022-11-23T03:12:18.1053283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1053444Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1053805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1053939Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1054359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1054484Z output = model(*input) 2022-11-23T03:12:18.1054809Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1054946Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1055317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1055484Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1055846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1055948Z _lazy_init(state, module) 2022-11-23T03:12:18.1056291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1056478Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1056815Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1056938Z return func(*args, **kwargs) 2022-11-23T03:12:18.1057310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1057410Z p_assert( 2022-11-23T03:12:18.1057744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1057852Z traceback.print_stack() 2022-11-23T03:12:18.1057975Z File "", line 1, in 2022-11-23T03:12:18.1058184Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1058324Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1058525Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1058673Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1058885Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1058971Z self.run() 2022-11-23T03:12:18.1059169Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1059308Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1059649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1059778Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1060131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1060252Z getattr(self, test_name)() 2022-11-23T03:12:18.1060607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1060686Z fn() 2022-11-23T03:12:18.1061048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.1061168Z test(self, **param_kwargs) 2022-11-23T03:12:18.1061515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1061631Z return func(*args, **kwargs) 2022-11-23T03:12:18.1061876Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1061981Z self.run_subtests( 2022-11-23T03:12:18.1062337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1062480Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1062839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1062987Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1063404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1063529Z output = model(*input) 2022-11-23T03:12:18.1064138Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1064407Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1065138Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1065441Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1066146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1066377Z _lazy_init(state, module) 2022-11-23T03:12:18.1066723Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1066944Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1067286Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1067404Z return func(*args, **kwargs) 2022-11-23T03:12:18.1067781Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1067864Z p_assert( 2022-11-23T03:12:18.1068192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1068350Z traceback.print_stack() 2022-11-23T03:12:18.1068485Z File "", line 1, in 2022-11-23T03:12:18.1068698Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1068840Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1069039Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1069197Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1069396Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1069500Z self.run() 2022-11-23T03:12:18.1069625Z File "", line 1, in 2022-11-23T03:12:18.1069828Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1069964Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1070305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1070435Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1070627Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1070767Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1071123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1071247Z getattr(self, test_name)() 2022-11-23T03:12:18.1071444Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in 
_main 2022-11-23T03:12:18.1071591Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1071944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1072044Z fn() 2022-11-23T03:12:18.1072239Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1072339Z self.run() 2022-11-23T03:12:18.1072701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1072819Z test(self, **param_kwargs) 2022-11-23T03:12:18.1073021Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1073160Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1073589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1073718Z return func(*args, **kwargs) 2022-11-23T03:12:18.1074041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1074165Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1074415Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1074521Z self.run_subtests( 2022-11-23T03:12:18.1074876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1074992Z getattr(self, test_name)() 2022-11-23T03:12:18.1075336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1075495Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1075893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1075983Z fn() 
2022-11-23T03:12:18.1076338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1076484Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1076840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1076956Z test(self, **param_kwargs) 2022-11-23T03:12:18.1077326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1077437Z output = model(*input) 2022-11-23T03:12:18.1077775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1077995Z return func(*args, **kwargs) 2022-11-23T03:12:18.1078323Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1078457Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1078703Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1078812Z self.run_subtests( 2022-11-23T03:12:18.1079184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1079356Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1079691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1079844Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1080208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1080326Z _lazy_init(state, module) 2022-11-23T03:12:18.1080690Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1080836Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1081186Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1081325Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1081681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1081797Z output = model(*input) 2022-11-23T03:12:18.1082130Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1082251Z return func(*args, **kwargs) 2022-11-23T03:12:18.1082569Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1082710Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1083169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1083277Z p_assert( 2022-11-23T03:12:18.1083643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1083813Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1084148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1084269Z traceback.print_stack() 2022-11-23T03:12:18.1084630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1084744Z _lazy_init(state, module) 2022-11-23T03:12:18.1085089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1085281Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.1085602Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1085723Z return func(*args, **kwargs) 2022-11-23T03:12:18.1086093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1086189Z p_assert( 2022-11-23T03:12:18.1086520Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1086638Z traceback.print_stack() 2022-11-23T03:12:18.1086868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1087102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1087313Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1087548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1087678Z File "", line 1, in 2022-11-23T03:12:18.1087801Z File "", line 1, in 2022-11-23T03:12:18.1088008Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1088144Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1088271Z File "", line 1, in 2022-11-23T03:12:18.1088456Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1088602Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1088803Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1088977Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1089103Z File "", line 1, in 2022-11-23T03:12:18.1089309Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1089453Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1089662Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1089748Z self.run() 2022-11-23T03:12:18.1089953Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1090084Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1090278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1090424Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1090619Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1090760Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1090939Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1091082Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1091328Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1091516Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1091719Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1091819Z self.run() 2022-11-23T03:12:18.1092022Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1092116Z self.run() 2022-11-23T03:12:18.1092443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1092573Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1092782Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1092879Z self.run() 2022-11-23T03:12:18.1093075Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1093214Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1093463Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1093587Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1093949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1094069Z getattr(self, test_name)() 2022-11-23T03:12:18.1094405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1094538Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1094734Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1094876Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1095231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1095334Z getattr(self, test_name)() 2022-11-23T03:12:18.1095677Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1095804Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1096165Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1096255Z fn() 2022-11-23T03:12:18.1096587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1096715Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1097065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1097144Z fn() 2022-11-23T03:12:18.1097495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1097610Z getattr(self, test_name)() 2022-11-23T03:12:18.1097963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1098089Z getattr(self, test_name)() 2022-11-23T03:12:18.1098449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1098568Z test(self, **param_kwargs) 2022-11-23T03:12:18.1098929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1099033Z test(self, **param_kwargs) 2022-11-23T03:12:18.1099383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1099477Z fn() 2022-11-23T03:12:18.1099827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1099921Z fn() 2022-11-23T03:12:18.1100273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1100445Z return func(*args, **kwargs) 2022-11-23T03:12:18.1100802Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1100920Z test(self, **param_kwargs) 2022-11-23T03:12:18.1101272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1101390Z return func(*args, **kwargs) 2022-11-23T03:12:18.1101750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1101863Z test(self, **param_kwargs) 2022-11-23T03:12:18.1102109Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1102218Z self.run_subtests( 2022-11-23T03:12:18.1102554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1102730Z return func(*args, **kwargs) 2022-11-23T03:12:18.1102978Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1103088Z self.run_subtests( 2022-11-23T03:12:18.1103442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1103562Z return func(*args, **kwargs) 2022-11-23T03:12:18.1104231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1104539Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1105198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1105494Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1106187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1106345Z fsdp_loss = 
self._train_for_several_steps( 2022-11-23T03:12:18.1106594Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1106703Z self.run_subtests( 2022-11-23T03:12:18.1106943Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1107052Z self.run_subtests( 2022-11-23T03:12:18.1107405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1107555Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1107924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1108043Z output = model(*input) 2022-11-23T03:12:18.1108399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1108560Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1108898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1109051Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1109412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1109547Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1109917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1110031Z output = model(*input) 2022-11-23T03:12:18.1110353Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1110490Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1110990Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1111151Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1111527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1111686Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1112057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1112176Z output = model(*input) 2022-11-23T03:12:18.1112497Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1112635Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1112956Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1113165Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1113539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1113639Z output = model(*input) 2022-11-23T03:12:18.1114003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1114120Z _lazy_init(state, module) 2022-11-23T03:12:18.1114490Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1114662Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1115034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1115199Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1115532Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1115654Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1115999Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1116139Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1116503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1116622Z _lazy_init(state, module) 2022-11-23T03:12:18.1116985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1117103Z _lazy_init(state, module) 2022-11-23T03:12:18.1117449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1117577Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1117953Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1118123Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1118460Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1118578Z return func(*args, **kwargs) 2022-11-23T03:12:18.1118938Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1119052Z _lazy_init(state, module) 2022-11-23T03:12:18.1119397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1119518Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1119847Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1119973Z return 
func(*args, **kwargs) 2022-11-23T03:12:18.1120393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1120500Z p_assert( 2022-11-23T03:12:18.1120870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1120967Z p_assert( 2022-11-23T03:12:18.1121314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1121437Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1121772Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1121890Z return func(*args, **kwargs) 2022-11-23T03:12:18.1122223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1122393Z traceback.print_stack() 2022-11-23T03:12:18.1122728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1122850Z traceback.print_stack() 2022-11-23T03:12:18.1123183Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1123288Z return func(*args, **kwargs) 2022-11-23T03:12:18.1123659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1123759Z p_assert( 2022-11-23T03:12:18.1124124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1124222Z p_assert( 2022-11-23T03:12:18.1124549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1124671Z traceback.print_stack() 2022-11-23T03:12:18.1125001Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1125107Z traceback.print_stack() 2022-11-23T03:12:18.1125341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1125574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1125803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1126032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1126158Z File "", line 1, in 2022-11-23T03:12:18.1126366Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1126504Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1126688Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1126838Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1127054Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1127155Z self.run() 2022-11-23T03:12:18.1127358Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1127505Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1127845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1127960Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1128318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1128437Z getattr(self, test_name)() 2022-11-23T03:12:18.1128795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1128894Z fn() 2022-11-23T03:12:18.1129299Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1129425Z test(self, **param_kwargs) 2022-11-23T03:12:18.1129776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1129882Z return func(*args, **kwargs) 2022-11-23T03:12:18.1130128Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1130240Z self.run_subtests( 2022-11-23T03:12:18.1130590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1130749Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1130875Z File "", line 1, in 2022-11-23T03:12:18.1131237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1131440Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1131799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1131916Z output = model(*input) 2022-11-23T03:12:18.1132239Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1132377Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1132584Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1132723Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1133095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1133270Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1133618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T03:12:18.1133743Z _lazy_init(state, module) 2022-11-23T03:12:18.1133948Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1134097Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1134447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1134587Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1134923Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1135046Z return func(*args, **kwargs) 2022-11-23T03:12:18.1135157Z File "", line 1, in 2022-11-23T03:12:18.1135459Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1135558Z self.run() 2022-11-23T03:12:18.1135937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1136037Z p_assert( 2022-11-23T03:12:18.1136240Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1136380Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1136564Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1136709Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1137045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1137170Z traceback.print_stack() 2022-11-23T03:12:18.1137367Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1137515Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1137847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1137980Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1138224Z 
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1138332Z self.run() 2022-11-23T03:12:18.1138534Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1138677Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1139037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1139156Z getattr(self, test_name)() 2022-11-23T03:12:18.1139492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1139621Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1139961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1140080Z getattr(self, test_name)() 2022-11-23T03:12:18.1140490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1140586Z fn() 2022-11-23T03:12:18.1140938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1141034Z fn() 2022-11-23T03:12:18.1141395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1141499Z test(self, **param_kwargs) 2022-11-23T03:12:18.1141857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1141977Z test(self, **param_kwargs) 2022-11-23T03:12:18.1142323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1142441Z return func(*args, **kwargs) 2022-11-23T03:12:18.1142699Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in 
test_delayed_optim_step 2022-11-23T03:12:18.1142810Z self.run_subtests( 2022-11-23T03:12:18.1143165Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1143270Z return func(*args, **kwargs) 2022-11-23T03:12:18.1143619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1143777Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1144164Z File "", line 1, in 2022-11-23T03:12:18.1144878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1145158Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1145624Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1145833Z self.run_subtests( 2022-11-23T03:12:18.1146550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1146680Z output = model(*input) 2022-11-23T03:12:18.1146889Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1147030Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1147362Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1147500Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1147850Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1148002Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1148186Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1148342Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1148806Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1148994Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1149355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1149503Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1149714Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1149818Z self.run() 2022-11-23T03:12:18.1150162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1150278Z _lazy_init(state, module) 2022-11-23T03:12:18.1150651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1150825Z output = model(*input) 2022-11-23T03:12:18.1151030Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1151173Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1151528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1151668Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1151978Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1152116Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1152446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1152576Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1152908Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1153033Z return func(*args, **kwargs) 2022-11-23T03:12:18.1153411Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1153585Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1153926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1154046Z getattr(self, test_name)() 2022-11-23T03:12:18.1154421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1154516Z p_assert( 2022-11-23T03:12:18.1154879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1154994Z _lazy_init(state, module) 2022-11-23T03:12:18.1155349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1155443Z fn() 2022-11-23T03:12:18.1155763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1155887Z traceback.print_stack() 2022-11-23T03:12:18.1156233Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1156375Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1156737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1156858Z test(self, **param_kwargs) 2022-11-23T03:12:18.1157190Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1157311Z return func(*args, **kwargs) 2022-11-23T03:12:18.1157649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1157772Z return func(*args, **kwargs) 
2022-11-23T03:12:18.1158195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1158298Z p_assert( 2022-11-23T03:12:18.1158546Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1158661Z self.run_subtests( 2022-11-23T03:12:18.1159078Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1159198Z traceback.print_stack() 2022-11-23T03:12:18.1159532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1159691Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1160046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1160254Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1160631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1160749Z output = model(*input) 2022-11-23T03:12:18.1161069Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1161210Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1161569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1161745Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1162108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1162227Z _lazy_init(state, module) 2022-11-23T03:12:18.1162573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in 
_lazy_init 2022-11-23T03:12:18.1162722Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1163057Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1163181Z return func(*args, **kwargs) 2022-11-23T03:12:18.1163541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1163638Z p_assert( 2022-11-23T03:12:18.1163969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1164090Z traceback.print_stack() 2022-11-23T03:12:18.1164322Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1164556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1164788Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1165020Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1165132Z File "", line 1, in 2022-11-23T03:12:18.1165341Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1165479Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1165682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1165831Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1166040Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1166142Z self.run() 2022-11-23T03:12:18.1166325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1166471Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1166809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1166994Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1167364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1167484Z getattr(self, test_name)() 2022-11-23T03:12:18.1167842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1167941Z fn() 2022-11-23T03:12:18.1168290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1168456Z test(self, **param_kwargs) 2022-11-23T03:12:18.1168810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1168932Z return func(*args, **kwargs) 2022-11-23T03:12:18.1169180Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1169345Z self.run_subtests( 2022-11-23T03:12:18.1169472Z File "", line 1, in 2022-11-23T03:12:18.1169819Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1169964Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1170324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1170473Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1170675Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1170814Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1171186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1171309Z output = model(*input) 2022-11-23T03:12:18.1171510Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1171644Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1171972Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1172109Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1172321Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1172423Z self.run() 2022-11-23T03:12:18.1172799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1172972Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1173172Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1173300Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1173673Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1173790Z _lazy_init(state, module) 2022-11-23T03:12:18.1174130Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1174262Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1174613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1174754Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1175118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1175223Z getattr(self, test_name)() 2022-11-23T03:12:18.1175557Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1175686Z return func(*args, **kwargs) 2022-11-23T03:12:18.1176094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1176200Z fn() 2022-11-23T03:12:18.1176583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1176687Z p_assert( 2022-11-23T03:12:18.1177029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1177155Z test(self, **param_kwargs) 2022-11-23T03:12:18.1177489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1177611Z traceback.print_stack() 2022-11-23T03:12:18.1177965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1178083Z return func(*args, **kwargs) 2022-11-23T03:12:18.1178329Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1178492Z self.run_subtests( 2022-11-23T03:12:18.1178828Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1178988Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1179349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1179501Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1179873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1179990Z output = model(*input) 2022-11-23T03:12:18.1180313Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1180450Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1180567Z File "", line 1, in 2022-11-23T03:12:18.1180944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1181121Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1181482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1181601Z _lazy_init(state, module) 2022-11-23T03:12:18.1181809Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1181950Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1182301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1182426Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1182554Z File "", line 1, in 2022-11-23T03:12:18.1182757Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1182919Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1183258Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1183380Z return func(*args, **kwargs) 2022-11-23T03:12:18.1183590Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1183691Z self.run() 2022-11-23T03:12:18.1184273Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1184552Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1185295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1185479Z p_assert( 2022-11-23T03:12:18.1185856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1186129Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1186460Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1186605Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1186950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1187073Z traceback.print_stack() 2022-11-23T03:12:18.1187408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1187539Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1187750Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1187852Z self.run() 2022-11-23T03:12:18.1188211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1188316Z getattr(self, test_name)() 2022-11-23T03:12:18.1188629Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1188778Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1189138Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1189234Z fn() 2022-11-23T03:12:18.1189565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1189692Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1190059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1190163Z test(self, **param_kwargs) 2022-11-23T03:12:18.1190519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1190639Z getattr(self, test_name)() 2022-11-23T03:12:18.1190996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1191126Z return func(*args, **kwargs) 2022-11-23T03:12:18.1191480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1191577Z fn() 2022-11-23T03:12:18.1191826Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1191922Z self.run_subtests( 2022-11-23T03:12:18.1192286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1192407Z test(self, **param_kwargs) 2022-11-23T03:12:18.1192755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1192918Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1193269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1193399Z return func(*args, **kwargs) 2022-11-23T03:12:18.1193764Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1193900Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1194151Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1194262Z self.run_subtests( 2022-11-23T03:12:18.1194633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1194750Z output = model(*input) 2022-11-23T03:12:18.1195095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1195254Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1195579Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1195751Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1196124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1196276Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1196646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1196820Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1197193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1197310Z output = model(*input) 2022-11-23T03:12:18.1197674Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1197831Z _lazy_init(state, module) 2022-11-23T03:12:18.1198161Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in 
_call_impl 2022-11-23T03:12:18.1198301Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1198650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1198791Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1199163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1199341Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1199673Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1199778Z return func(*args, **kwargs) 2022-11-23T03:12:18.1200139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1200261Z _lazy_init(state, module) 2022-11-23T03:12:18.1200637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1200736Z p_assert( 2022-11-23T03:12:18.1201085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1201225Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1201556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1201663Z traceback.print_stack() 2022-11-23T03:12:18.1201997Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1202114Z return func(*args, **kwargs) 2022-11-23T03:12:18.1202484Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1202588Z p_assert( 2022-11-23T03:12:18.1202918Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1203045Z traceback.print_stack() 2022-11-23T03:12:18.1203279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1203497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1203728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1203956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1204084Z File "", line 1, in 2022-11-23T03:12:18.1204292Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1204432Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1204633Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1204832Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1205032Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1205137Z self.run() 2022-11-23T03:12:18.1205339Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1205484Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1205610Z File "", line 1, in 2022-11-23T03:12:18.1205954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1206087Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1206432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1206553Z getattr(self, test_name)() 2022-11-23T03:12:18.1206762Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1206961Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:18.1207321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1207418Z fn() 2022-11-23T03:12:18.1207617Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1207764Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1208112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1208234Z test(self, **param_kwargs) 2022-11-23T03:12:18.1208444Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1208547Z self.run() 2022-11-23T03:12:18.1208905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1209032Z return func(*args, **kwargs) 2022-11-23T03:12:18.1209235Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1209363Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1209614Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1209725Z self.run_subtests( 2022-11-23T03:12:18.1210078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1210238Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1210598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1210747Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1211149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1211267Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1211643Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1211761Z output = model(*input) 2022-11-23T03:12:18.1212120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1212242Z getattr(self, test_name)() 2022-11-23T03:12:18.1212561Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1212700Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1213050Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1213128Z fn() 2022-11-23T03:12:18.1213500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1213676Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1214099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1214226Z test(self, **param_kwargs) 2022-11-23T03:12:18.1214592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1214711Z _lazy_init(state, module) 2022-11-23T03:12:18.1215068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1215172Z return func(*args, **kwargs) 2022-11-23T03:12:18.1215521Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1215661Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1215908Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1216066Z self.run_subtests( 
2022-11-23T03:12:18.1216407Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1216529Z return func(*args, **kwargs) 2022-11-23T03:12:18.1216879Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1217022Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1217399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1217500Z p_assert( 2022-11-23T03:12:18.1217862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1218013Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1218346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1218477Z traceback.print_stack() 2022-11-23T03:12:18.1218854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1218954Z output = model(*input) 2022-11-23T03:12:18.1219084Z File "", line 1, in 2022-11-23T03:12:18.1219407Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1219546Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1219922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1220095Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1220303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1220442Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1220800Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1220924Z _lazy_init(state, module) 2022-11-23T03:12:18.1221132Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1221284Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1221637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1221782Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1221997Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1222103Z self.run() 2022-11-23T03:12:18.1222422Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1222549Z return func(*args, **kwargs) 2022-11-23T03:12:18.1222755Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1222958Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1223345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1223450Z p_assert( 2022-11-23T03:12:18.1223787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1224226Z traceback.print_stack() 2022-11-23T03:12:18.1224873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1225127Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1225836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1226065Z getattr(self, test_name)() 2022-11-23T03:12:18.1226476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:18.1226660Z fn() 2022-11-23T03:12:18.1227032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1227159Z test(self, **param_kwargs) 2022-11-23T03:12:18.1227494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1227624Z return func(*args, **kwargs) 2022-11-23T03:12:18.1227877Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1227991Z self.run_subtests( 2022-11-23T03:12:18.1228341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1228501Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1228863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1229020Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1229379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1229499Z output = model(*input) 2022-11-23T03:12:18.1229822Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1229963Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1230336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1230514Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1230880Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1231004Z _lazy_init(state, module) 2022-11-23T03:12:18.1231343Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1231488Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1231830Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1231959Z return func(*args, **kwargs) 2022-11-23T03:12:18.1232334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1232440Z p_assert( 2022-11-23T03:12:18.1232571Z File "", line 1, in 2022-11-23T03:12:18.1232889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1233016Z traceback.print_stack() 2022-11-23T03:12:18.1233229Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1233375Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1233647Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1233810Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1234028Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1234132Z self.run() 2022-11-23T03:12:18.1234318Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1234464Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1234808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1234942Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1235423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1235553Z getattr(self, test_name)() 2022-11-23T03:12:18.1235916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:18.1236069Z fn() 2022-11-23T03:12:18.1236425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1236551Z test(self, **param_kwargs) 2022-11-23T03:12:18.1236909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1237039Z return func(*args, **kwargs) 2022-11-23T03:12:18.1237292Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1237408Z self.run_subtests( 2022-11-23T03:12:18.1237766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1237930Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1238275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1238435Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1238815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1238937Z output = model(*input) 2022-11-23T03:12:18.1239266Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1239409Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1239784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1239956Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1240304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1240423Z _lazy_init(state, module) 2022-11-23T03:12:18.1240778Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1240925Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1241264Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1241390Z return func(*args, **kwargs) 2022-11-23T03:12:18.1241772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1241880Z p_assert( 2022-11-23T03:12:18.1242198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1242325Z traceback.print_stack() 2022-11-23T03:12:18.1242565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1242805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1243094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1243335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1243565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1243795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1244005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1244239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1244469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1244694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1244922Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1245199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1245427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1245659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1245868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1246094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1246324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1246557Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1246783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1247004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1247237Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1247463Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1247690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1247896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1248917Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:18.1249159Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T03:12:18.1250157Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:18.1250392Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T03:12:18.1251427Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:18.1251673Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T03:12:18.1252672Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T03:12:18.1252902Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T03:12:18.1253139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1253377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1253660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1253892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1254122Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1254329Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1254561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1254790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1255018Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1255246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1255476Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1255705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1255934Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1256143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1256369Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1256593Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1256822Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1257049Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1257279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1257511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1257734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1257962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1258170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1258397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1258622Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1258846Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1259068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1259295Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1259575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1259806Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1260012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1260232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1260454Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1260679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1260902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1261122Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1261342Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1261658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1262116Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1262362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1262585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1262809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1263032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1263255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1263477Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1263702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1264259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1264683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1265101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1265533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1265956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1266345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1266575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1266836Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1267068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1267271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1267496Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1267719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1267943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1268166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1268398Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1268652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1268876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1269199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1269415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1269639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1269864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1270089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1270311Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1270532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1270752Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1270974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1271133Z dist init r=3, world=4 2022-11-23T03:12:18.1271467Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1271788Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1272115Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1272429Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1272736Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1273068Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1273383Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.1273689Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1274012Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1274326Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1274622Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1274939Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.1275051Z dist init r=2, world=4 2022-11-23T03:12:18.1275359Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1275664Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1275975Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1276345Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1276655Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.1276973Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1277279Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1277586Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1277903Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1278248Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1278552Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1278865Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.1278978Z dist init r=1, world=4 2022-11-23T03:12:18.1279287Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1279591Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1279907Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.1280212Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1280514Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1280827Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1281134Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1281430Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1281742Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1282049Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1282349Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1282661Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.1282776Z dist init r=0, world=4 2022-11-23T03:12:18.1283122Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.1283429Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1283740Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1284046Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1284350Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1284646Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1285003Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1285306Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1285617Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1285923Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1286227Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.1286542Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.1286645Z ok (36.390s) 2022-11-23T03:12:18.1286864Z test_delayed_optim_step_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T03:12:18.1287176Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11997 2022-11-23T03:12:18.1287393Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11998 2022-11-23T03:12:18.1287592Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11999 2022-11-23T03:12:18.1287804Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12000 2022-11-23T03:12:18.1288211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.1288394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.1288778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.1288970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.1289335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.1289506Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.1289866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.1290054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.1290413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.1290586Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.1291005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.1291198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.1291563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.1291733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.1292100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.1292268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.1292512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.1292757Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.1293051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.1293283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.1293682Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.1294077Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.1294471Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.1294862Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.1295074Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.1295309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.1295534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.1295750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.1295984Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1296213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1296443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1296673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1297696Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.1297812Z warnings.warn( 2022-11-23T03:12:18.1298795Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.1298905Z warnings.warn( 2022-11-23T03:12:18.1299936Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.1300054Z warnings.warn( 2022-11-23T03:12:18.1301046Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.1301154Z warnings.warn( 2022-11-23T03:12:18.1301282Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1301455Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1301583Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1301705Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1301918Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1302042Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1302253Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1302390Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1302596Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1302731Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1302937Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1303076Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1303261Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1303417Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1303617Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1303763Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1304324Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1304608Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1304980Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1305252Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1305627Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1305816Z self.run() 2022-11-23T03:12:18.1306168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1306274Z self.run() 2022-11-23T03:12:18.1306486Z
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1306596Z self.run() 2022-11-23T03:12:18.1306802Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1306886Z self.run() 2022-11-23T03:12:18.1307083Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1307229Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1307429Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1307570Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1307768Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1307908Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1308109Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1308235Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1308684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1308828Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1309165Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1309295Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1309629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1309759Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1310087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1310198Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1310561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1310684Z getattr(self, test_name)() 2022-11-23T03:12:18.1311170Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1311291Z getattr(self, test_name)() 2022-11-23T03:12:18.1311650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1311747Z fn() 2022-11-23T03:12:18.1312103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1312207Z getattr(self, test_name)() 2022-11-23T03:12:18.1312563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1312683Z getattr(self, test_name)() 2022-11-23T03:12:18.1313038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1313134Z fn() 2022-11-23T03:12:18.1313499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1313627Z test(self, **param_kwargs) 2022-11-23T03:12:18.1313966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1314060Z fn() 2022-11-23T03:12:18.1314411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1314503Z fn() 2022-11-23T03:12:18.1314860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1314980Z test(self, **param_kwargs) 2022-11-23T03:12:18.1315327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1315451Z return func(*args, **kwargs) 2022-11-23T03:12:18.1315797Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1315924Z test(self, **param_kwargs) 2022-11-23T03:12:18.1316289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1316410Z test(self, **param_kwargs) 2022-11-23T03:12:18.1316764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1316885Z return func(*args, **kwargs) 2022-11-23T03:12:18.1317137Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1317250Z self.run_subtests( 2022-11-23T03:12:18.1317584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1317704Z return func(*args, **kwargs) 2022-11-23T03:12:18.1318056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1318233Z return func(*args, **kwargs) 2022-11-23T03:12:18.1318489Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1318601Z self.run_subtests( 2022-11-23T03:12:18.1318845Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1318956Z self.run_subtests( 2022-11-23T03:12:18.1319291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1319451Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1319696Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1319806Z self.run_subtests( 2022-11-23T03:12:18.1320149Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1320355Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1320709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1320867Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1321212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1321362Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1321708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1321864Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1322237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1322359Z output = model(*input) 2022-11-23T03:12:18.1322727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1322875Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1323214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1323366Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1323718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1323864Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1324184Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1324323Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1324693Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1324813Z output = model(*input) 2022-11-23T03:12:18.1325174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1325290Z output = model(*input) 2022-11-23T03:12:18.1325659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1325773Z output = model(*input) 2022-11-23T03:12:18.1326150Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1326326Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1326650Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1326788Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1327090Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1327314Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1327639Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1327777Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1328143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1328261Z _lazy_init(state, module) 2022-11-23T03:12:18.1328634Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1328807Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1329216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T03:12:18.1329373Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1329793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1329913Z _lazy_init(state, module) 2022-11-23T03:12:18.1330282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1330454Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1330804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1330944Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1331307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1331409Z _lazy_init(state, module) 2022-11-23T03:12:18.1331757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1331901Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1332266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1332382Z _lazy_init(state, module) 2022-11-23T03:12:18.1332717Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1332841Z return func(*args, **kwargs) 2022-11-23T03:12:18.1333168Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1333273Z return func(*args, **kwargs) 2022-11-23T03:12:18.1333618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1333756Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1334133Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1334238Z p_assert( 2022-11-23T03:12:18.1334588Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1334725Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1335059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1335167Z traceback.print_stack() 2022-11-23T03:12:18.1335543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1335644Z p_assert( 2022-11-23T03:12:18.1335973Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1336095Z return func(*args, **kwargs) 2022-11-23T03:12:18.1336431Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1336555Z return func(*args, **kwargs) 2022-11-23T03:12:18.1336950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1337066Z traceback.print_stack() 2022-11-23T03:12:18.1337445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1337544Z p_assert( 2022-11-23T03:12:18.1337920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1338019Z p_assert( 2022-11-23T03:12:18.1338349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1338471Z traceback.print_stack() 2022-11-23T03:12:18.1338784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.1338906Z traceback.print_stack() 2022-11-23T03:12:18.1339199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1339432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1339663Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1339893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1340022Z File "", line 1, in 2022-11-23T03:12:18.1340234Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1340357Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1340558Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1340707Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1340920Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1341028Z self.run() 2022-11-23T03:12:18.1341233Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1341378Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1341722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1341836Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1342203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1342327Z getattr(self, test_name)() 2022-11-23T03:12:18.1342683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1342781Z fn() 2022-11-23T03:12:18.1343144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.1343264Z test(self, **param_kwargs) 2022-11-23T03:12:18.1343622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1343730Z return func(*args, **kwargs) 2022-11-23T03:12:18.1344378Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1344591Z self.run_subtests( 2022-11-23T03:12:18.1345278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1345572Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1346272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1346546Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1346972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1347080Z output = model(*input) 2022-11-23T03:12:18.1347478Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1347631Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1348010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1348185Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1348549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1348670Z _lazy_init(state, module) 2022-11-23T03:12:18.1349017Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1349143Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1349479Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1349667Z return func(*args, **kwargs) 2022-11-23T03:12:18.1350053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1350154Z p_assert( 2022-11-23T03:12:18.1350487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1350612Z traceback.print_stack() 2022-11-23T03:12:18.1350742Z File "", line 1, in 2022-11-23T03:12:18.1350937Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1351079Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1351205Z File "", line 1, in 2022-11-23T03:12:18.1351405Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1351557Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1351772Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1351877Z self.run() 2022-11-23T03:12:18.1352070Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1352209Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1352413Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1352556Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1352756Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1352902Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1353246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1353380Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1353574Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 
2022-11-23T03:12:18.1353680Z self.run() 2022-11-23T03:12:18.1354043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1354166Z getattr(self, test_name)() 2022-11-23T03:12:18.1354370Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1354516Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1354875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1354956Z fn() 2022-11-23T03:12:18.1355293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1355424Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1355791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1355912Z test(self, **param_kwargs) 2022-11-23T03:12:18.1356321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1356448Z getattr(self, test_name)() 2022-11-23T03:12:18.1356807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1356913Z return func(*args, **kwargs) 2022-11-23T03:12:18.1357265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1357361Z fn() 2022-11-23T03:12:18.1357610Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1357723Z self.run_subtests( 2022-11-23T03:12:18.1358086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1358207Z test(self, **param_kwargs) 2022-11-23T03:12:18.1358555Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1358750Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1359106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1359228Z return func(*args, **kwargs) 2022-11-23T03:12:18.1359590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1359740Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1359985Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1360097Z self.run_subtests( 2022-11-23T03:12:18.1360470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1360572Z output = model(*input) 2022-11-23T03:12:18.1360930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1361090Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1361415Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1361555Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1361914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1362063Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1362438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1362596Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1362971Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1363094Z output = model(*input) 2022-11-23T03:12:18.1363461Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1363581Z _lazy_init(state, module) 2022-11-23T03:12:18.1363905Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1364043Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1364393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1364519Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1364895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1365066Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1365401Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1365573Z return func(*args, **kwargs) 2022-11-23T03:12:18.1365948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1366066Z _lazy_init(state, module) 2022-11-23T03:12:18.1366444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1366529Z p_assert( 2022-11-23T03:12:18.1366926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1367068Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1367410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1367534Z 
traceback.print_stack() 2022-11-23T03:12:18.1367663Z File "", line 1, in 2022-11-23T03:12:18.1368057Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1368181Z return func(*args, **kwargs) 2022-11-23T03:12:18.1368592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1368696Z p_assert( 2022-11-23T03:12:18.1368907Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1369049Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1369384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1369508Z traceback.print_stack() 2022-11-23T03:12:18.1369709Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1369859Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1370058Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1370167Z self.run() 2022-11-23T03:12:18.1370372Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1370515Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1370850Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1370981Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1371343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1371448Z getattr(self, test_name)() 2022-11-23T03:12:18.1371809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1371905Z fn() 2022-11-23T03:12:18.1372268Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1372393Z test(self, **param_kwargs) 2022-11-23T03:12:18.1372744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1372868Z return func(*args, **kwargs) 2022-11-23T03:12:18.1373116Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1373211Z self.run_subtests( 2022-11-23T03:12:18.1373565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1373726Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1374090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1374241Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1374614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1374734Z output = model(*input) 2022-11-23T03:12:18.1375105Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1375234Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1375614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1375791Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1376156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1376278Z _lazy_init(state, module) 2022-11-23T03:12:18.1376625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.1376766Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1377102Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1377259Z return func(*args, **kwargs) 2022-11-23T03:12:18.1377640Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1377739Z p_assert( 2022-11-23T03:12:18.1378075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1378197Z traceback.print_stack() 2022-11-23T03:12:18.1378434Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1378669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1378899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1379114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1379247Z File "", line 1, in 2022-11-23T03:12:18.1379460Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1379604Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1379807Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1379957Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1380168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1380270Z self.run() 2022-11-23T03:12:18.1380456Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1380600Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1380939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1381070Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1381432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1381666Z getattr(self, test_name)() 2022-11-23T03:12:18.1382076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1382201Z fn() 2022-11-23T03:12:18.1382812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1382980Z test(self, **param_kwargs) 2022-11-23T03:12:18.1383380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1383489Z return func(*args, **kwargs) 2022-11-23T03:12:18.1383775Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1384393Z self.run_subtests( 2022-11-23T03:12:18.1385148Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1385698Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1386397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1386657Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1387079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1387183Z output = model(*input) 2022-11-23T03:12:18.1387547Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1387723Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1388136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1388358Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1388849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1389009Z _lazy_init(state, module) 2022-11-23T03:12:18.1389439Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1389569Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1389945Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1390106Z return func(*args, **kwargs) 2022-11-23T03:12:18.1390532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1390669Z p_assert( 2022-11-23T03:12:18.1391083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.1391249Z traceback.print_stack() 2022-11-23T03:12:18.1391366Z File "", line 1, in 2022-11-23T03:12:18.1391621Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1391839Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1392082Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1392280Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1392535Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1392674Z self.run() 2022-11-23T03:12:18.1392912Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1393040Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1393421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1393588Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1394022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1394204Z getattr(self, test_name)() 2022-11-23T03:12:18.1394603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1394739Z fn() 2022-11-23T03:12:18.1395144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1395251Z test(self, **param_kwargs) 2022-11-23T03:12:18.1395680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1395839Z return func(*args, **kwargs) 2022-11-23T03:12:18.1396128Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1396322Z self.run_subtests( 2022-11-23T03:12:18.1396719Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1396989Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1397400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1397537Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1397949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1398106Z output = model(*input) 2022-11-23T03:12:18.1398465Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1398650Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1399101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1399315Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1399773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1399877Z _lazy_init(state, module) 2022-11-23T03:12:18.1400297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1400477Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1400863Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1401021Z return func(*args, **kwargs) 2022-11-23T03:12:18.1401440Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1401612Z p_assert( 2022-11-23T03:12:18.1401992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.1402099Z traceback.print_stack() 2022-11-23T03:12:18.1402268Z File "", line 1, in 2022-11-23T03:12:18.1402436Z File "", line 1, in 2022-11-23T03:12:18.1402695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1402874Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1403111Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1403296Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1403493Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1403703Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1403954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1404095Z self.run() 2022-11-23T03:12:18.1404372Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1404557Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1404801Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1404984Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1405181Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1405321Z self.run() 2022-11-23T03:12:18.1405743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1405914Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1406162Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1406341Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1406745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1406851Z getattr(self, test_name)() 2022-11-23T03:12:18.1407224Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1407448Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1407856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1408024Z fn() 2022-11-23T03:12:18.1408440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1408628Z getattr(self, test_name)() 2022-11-23T03:12:18.1409031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1409137Z test(self, **param_kwargs) 2022-11-23T03:12:18.1409531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1409663Z fn() 2022-11-23T03:12:18.1410056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1410276Z return func(*args, **kwargs) 2022-11-23T03:12:18.1410720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1410879Z test(self, **param_kwargs) 2022-11-23T03:12:18.1411225Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1411323Z self.run_subtests( 2022-11-23T03:12:18.1411725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1411884Z return func(*args, **kwargs) 2022-11-23T03:12:18.1412274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1412484Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1412768Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1412963Z self.run_subtests( 2022-11-23T03:12:18.1413401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1413538Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1413923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1414118Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1414526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1414688Z output = model(*input) 2022-11-23T03:12:18.1415090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1415277Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1415685Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1415810Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1416230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1416384Z output = model(*input) 2022-11-23T03:12:18.1416802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1417024Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1417387Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1417562Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1417962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:18.1418071Z _lazy_init(state, module) 2022-11-23T03:12:18.1418601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1418816Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1419204Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1419396Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1419800Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1419957Z _lazy_init(state, module) 2022-11-23T03:12:18.1420333Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1420494Z return func(*args, **kwargs) 2022-11-23T03:12:18.1420831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1421099Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1421533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1421674Z p_assert( 2022-11-23T03:12:18.1422048Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1422207Z return func(*args, **kwargs) 2022-11-23T03:12:18.1422576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1422685Z traceback.print_stack() 2022-11-23T03:12:18.1423096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1423262Z p_assert( 2022-11-23T03:12:18.1423676Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.1423840Z traceback.print_stack() 2022-11-23T03:12:18.1424648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1425147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1425638Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1426055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1426266Z File "", line 1, in 2022-11-23T03:12:18.1426518Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1426718Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1427010Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1427197Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1427448Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1427595Z self.run() 2022-11-23T03:12:18.1427786Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1427969Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1428366Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1428581Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1428984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1429185Z getattr(self, test_name)() 2022-11-23T03:12:18.1429586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1429718Z fn() 2022-11-23T03:12:18.1429830Z File "", line 1, in 2022-11-23T03:12:18.1430232Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1430481Z test(self, **param_kwargs) 2022-11-23T03:12:18.1430909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1431092Z return func(*args, **kwargs) 2022-11-23T03:12:18.1431335Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1431548Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1431786Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1431938Z self.run_subtests( 2022-11-23T03:12:18.1432174Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1432359Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1432758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1433060Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1433313Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1433453Z self.run() 2022-11-23T03:12:18.1433806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1434030Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1434271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1434462Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1434626Z File "", line 1, in 2022-11-23T03:12:18.1435042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1435197Z output = model(*input) 2022-11-23T03:12:18.1435571Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1435693Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1436056Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1436269Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1436528Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1436709Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1437110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1437300Z getattr(self, test_name)() 2022-11-23T03:12:18.1437714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1437885Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1438123Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1438319Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1438766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1438901Z fn() 2022-11-23T03:12:18.1439308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1439465Z _lazy_init(state, module) 2022-11-23T03:12:18.1439712Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1439799Z self.run() 2022-11-23T03:12:18.1440194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1440354Z test(self, **param_kwargs) 2022-11-23T03:12:18.1440751Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1440965Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1441264Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1441454Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1441802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1441992Z return func(*args, **kwargs) 2022-11-23T03:12:18.1442367Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1442525Z return func(*args, **kwargs) 2022-11-23T03:12:18.1442943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1443112Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1443397Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1443601Z self.run_subtests( 2022-11-23T03:12:18.1443970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1444112Z p_assert( 2022-11-23T03:12:18.1444509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1444682Z getattr(self, test_name)() 2022-11-23T03:12:18.1445069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1445303Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1445676Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1445836Z traceback.print_stack() 2022-11-23T03:12:18.1446181Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1446317Z fn() 2022-11-23T03:12:18.1446758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1446956Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1447122Z File "", line 1, in 2022-11-23T03:12:18.1447525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1447716Z test(self, **param_kwargs) 2022-11-23T03:12:18.1448131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1448233Z output = model(*input) 2022-11-23T03:12:18.1448622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1448780Z return func(*args, **kwargs) 2022-11-23T03:12:18.1449150Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1449332Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1449580Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1449755Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1450077Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1450178Z self.run_subtests( 2022-11-23T03:12:18.1450592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1450802Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1451050Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1451268Z return 
self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1451656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1451934Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1452193Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1452281Z self.run() 2022-11-23T03:12:18.1452724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1452886Z _lazy_init(state, module) 2022-11-23T03:12:18.1453295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1453482Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1453725Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1453905Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1454293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1454468Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1454886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1455089Z output = model(*input) 2022-11-23T03:12:18.1455471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1455670Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1456044Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1456259Z return func(*args, **kwargs) 2022-11-23T03:12:18.1456621Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1456749Z return forward_call(*input, 
**kwargs) 2022-11-23T03:12:18.1457145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1457316Z getattr(self, test_name)() 2022-11-23T03:12:18.1457770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1457909Z p_assert( 2022-11-23T03:12:18.1458328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1458543Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1459038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1459126Z fn() 2022-11-23T03:12:18.1459503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1459671Z traceback.print_stack() 2022-11-23T03:12:18.1460073Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1460277Z _lazy_init(state, module) 2022-11-23T03:12:18.1460700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1460889Z test(self, **param_kwargs) 2022-11-23T03:12:18.1461274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1461403Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1461793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1461961Z return func(*args, **kwargs) 2022-11-23T03:12:18.1462335Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1462493Z return 
func(*args, **kwargs) 2022-11-23T03:12:18.1462811Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1463016Z self.run_subtests( 2022-11-23T03:12:18.1463393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1463638Z p_assert( 2022-11-23T03:12:18.1464467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1464839Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1465541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1465822Z traceback.print_stack() 2022-11-23T03:12:18.1466459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1466703Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1467073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1467351Z output = model(*input) 2022-11-23T03:12:18.1467728Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1467903Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1468316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1468554Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1468977Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1469133Z _lazy_init(state, module) 2022-11-23T03:12:18.1469560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T03:12:18.1469691Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1470083Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1470245Z return func(*args, **kwargs) 2022-11-23T03:12:18.1470663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1470801Z p_assert( 2022-11-23T03:12:18.1471171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1471331Z traceback.print_stack() 2022-11-23T03:12:18.1471554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1471827Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1472170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1472443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1472613Z File "", line 1, in 2022-11-23T03:12:18.1472864Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1473042Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1473278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1473411Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1473663Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1473808Z self.run() 2022-11-23T03:12:18.1474078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1474265Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1474646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1474812Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1475285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1475399Z getattr(self, test_name)() 2022-11-23T03:12:18.1475800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1475943Z fn() 2022-11-23T03:12:18.1476387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1476582Z test(self, **param_kwargs) 2022-11-23T03:12:18.1476975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1477134Z return func(*args, **kwargs) 2022-11-23T03:12:18.1477420Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1477516Z self.run_subtests( 2022-11-23T03:12:18.1477901Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1478163Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1478568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1478756Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1479202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1479360Z output = model(*input) 2022-11-23T03:12:18.1479726Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1479850Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1480269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1480478Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1480893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1481089Z _lazy_init(state, module) 2022-11-23T03:12:18.1481476Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1481688Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1482065Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1482171Z return func(*args, **kwargs) 2022-11-23T03:12:18.1482595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1482733Z p_assert( 2022-11-23T03:12:18.1483111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.1483275Z traceback.print_stack() 2022-11-23T03:12:18.1483438Z File "", line 1, in 2022-11-23T03:12:18.1483688Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1483901Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1484089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1484287Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1484536Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1484673Z self.run() 2022-11-23T03:12:18.1484910Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1485127Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1485507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1485622Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1486072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1486280Z getattr(self, test_name)() 2022-11-23T03:12:18.1486679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1486811Z fn() 2022-11-23T03:12:18.1487216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1487373Z test(self, **param_kwargs) 2022-11-23T03:12:18.1487772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1487880Z return func(*args, **kwargs) 2022-11-23T03:12:18.1488162Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1488316Z self.run_subtests( 2022-11-23T03:12:18.1488798Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1488997Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1489403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1489587Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1490033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1490134Z output = model(*input) 2022-11-23T03:12:18.1490518Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1490693Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1491104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1491355Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1491764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1491922Z _lazy_init(state, module) 2022-11-23T03:12:18.1492310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1492436Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1492818Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1492975Z return func(*args, **kwargs) 2022-11-23T03:12:18.1493388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1493526Z p_assert( 2022-11-23T03:12:18.1493936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.1494104Z traceback.print_stack() 2022-11-23T03:12:18.1494271Z File "", line 1, in 2022-11-23T03:12:18.1494383Z File "", line 1, in 2022-11-23T03:12:18.1494672Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1494849Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1495089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1495275Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1495518Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1495724Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1495923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1496062Z self.run() 2022-11-23T03:12:18.1496307Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1496499Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1496780Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1496968Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1497216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1497352Z self.run() 2022-11-23T03:12:18.1497680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1497883Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1498135Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1498351Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1498748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1498908Z getattr(self, test_name)() 2022-11-23T03:12:18.1499344Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1499514Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1499856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1499999Z fn() 2022-11-23T03:12:18.1500434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1500592Z getattr(self, test_name)() 2022-11-23T03:12:18.1500993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1501151Z test(self, **param_kwargs) 2022-11-23T03:12:18.1501547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1501626Z fn() 2022-11-23T03:12:18.1502019Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1502195Z return func(*args, **kwargs) 2022-11-23T03:12:18.1502601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1502793Z test(self, **param_kwargs) 2022-11-23T03:12:18.1503113Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1503261Z self.run_subtests( 2022-11-23T03:12:18.1503660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1503767Z return func(*args, **kwargs) 2022-11-23T03:12:18.1504800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1505171Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1505699Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1505969Z self.run_subtests( 2022-11-23T03:12:18.1506578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1506774Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1507170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1507313Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1507726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1507892Z output = model(*input) 2022-11-23T03:12:18.1508291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1508476Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1508927Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1509187Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1509605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1509707Z output = model(*input) 2022-11-23T03:12:18.1510120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1510341Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1510704Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1510881Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1511365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:18.1511594Z _lazy_init(state, module) 2022-11-23T03:12:18.1512028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1512221Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1512546Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1512706Z return func(*args, **kwargs) 2022-11-23T03:12:18.1513119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1513256Z p_assert( 2022-11-23T03:12:18.1513629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1513791Z traceback.print_stack() 2022-11-23T03:12:18.1514235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1514493Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1514851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1515007Z _lazy_init(state, module) 2022-11-23T03:12:18.1515395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1515572Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1515947Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1516107Z return func(*args, **kwargs) 2022-11-23T03:12:18.1516522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1516667Z p_assert( 2022-11-23T03:12:18.1516989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.1517192Z traceback.print_stack() 2022-11-23T03:12:18.1517470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1517741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1518002Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1518269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1518434Z File "", line 1, in 2022-11-23T03:12:18.1518628Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1518814Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1519006Z File "", line 1, in 2022-11-23T03:12:18.1519277Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1519468Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1519629Z File "", line 1, in 2022-11-23T03:12:18.1519919Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1520101Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1520295Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1520442Z self.run() 2022-11-23T03:12:18.1520678Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1520860Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1521142Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1521317Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1521554Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1521684Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1521990Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1522132Z self.run() 2022-11-23T03:12:18.1522397Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1522578Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1522962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1523162Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1523401Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1523529Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1523790Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1523924Z self.run() 2022-11-23T03:12:18.1524328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1524493Z getattr(self, test_name)() 2022-11-23T03:12:18.1524868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1525035Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1525305Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1525436Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1525845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1525977Z fn() 2022-11-23T03:12:18.1526374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1526532Z getattr(self, test_name)() 2022-11-23T03:12:18.1526942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1527109Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1527464Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1527623Z test(self, **param_kwargs) 2022-11-23T03:12:18.1528055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1528190Z fn() 2022-11-23T03:12:18.1528585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1528741Z getattr(self, test_name)() 2022-11-23T03:12:18.1529133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1529292Z return func(*args, **kwargs) 2022-11-23T03:12:18.1529644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1529804Z test(self, **param_kwargs) 2022-11-23T03:12:18.1530259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1530431Z fn() 2022-11-23T03:12:18.1530718Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1530865Z self.run_subtests( 2022-11-23T03:12:18.1531255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1531443Z return func(*args, **kwargs) 2022-11-23T03:12:18.1531794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1531950Z test(self, **param_kwargs) 2022-11-23T03:12:18.1532347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1532541Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1532914Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1533067Z self.run_subtests( 2022-11-23T03:12:18.1533469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1533627Z return func(*args, **kwargs) 2022-11-23T03:12:18.1533964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1534172Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1534336Z File "", line 1, in 2022-11-23T03:12:18.1534738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1534923Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1535240Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1535515Z self.run_subtests( 2022-11-23T03:12:18.1535929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1536064Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1536497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1536727Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1537144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1537302Z output = model(*input) 2022-11-23T03:12:18.1537549Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1537763Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1538182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T03:12:18.1538290Z output = model(*input) 2022-11-23T03:12:18.1538704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1538892Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1539256Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1539434Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1539671Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1539855Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1540246Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1540370Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1540792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1541025Z output = model(*input) 2022-11-23T03:12:18.1541451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1541663Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1541913Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1542051Z self.run() 2022-11-23T03:12:18.1542453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1542558Z _lazy_init(state, module) 2022-11-23T03:12:18.1543015Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1543226Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1543637Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1543816Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1544452Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1544771Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1545524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1545765Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1546346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1546553Z _lazy_init(state, module) 2022-11-23T03:12:18.1547007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1547217Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1547621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1547799Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1548186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1548303Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1548675Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1548833Z return func(*args, **kwargs) 2022-11-23T03:12:18.1549271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1549429Z _lazy_init(state, module) 2022-11-23T03:12:18.1549798Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1549956Z return func(*args, **kwargs) 
2022-11-23T03:12:18.1550370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1550477Z getattr(self, test_name)() 2022-11-23T03:12:18.1550898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1551037Z p_assert( 2022-11-23T03:12:18.1551450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1551654Z p_assert( 2022-11-23T03:12:18.1552041Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1552218Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1552621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1552703Z fn() 2022-11-23T03:12:18.1553163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1553335Z traceback.print_stack() 2022-11-23T03:12:18.1553709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1553869Z traceback.print_stack() 2022-11-23T03:12:18.1554280Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1554440Z return func(*args, **kwargs) 2022-11-23T03:12:18.1554853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1554958Z test(self, **param_kwargs) 2022-11-23T03:12:18.1555367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1555503Z p_assert( 2022-11-23T03:12:18.1555900Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1556128Z return func(*args, **kwargs) 2022-11-23T03:12:18.1556623Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1556829Z traceback.print_stack() 2022-11-23T03:12:18.1557063Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1557212Z self.run_subtests( 2022-11-23T03:12:18.1557603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1557799Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1558198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1558387Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1558806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1558971Z output = model(*input) 2022-11-23T03:12:18.1559366Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1559614Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1560024Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1560237Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1560639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1560794Z _lazy_init(state, module) 2022-11-23T03:12:18.1561184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.1561407Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1561786Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1561896Z return func(*args, **kwargs) 2022-11-23T03:12:18.1562353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1562491Z p_assert( 2022-11-23T03:12:18.1562864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1563027Z traceback.print_stack() 2022-11-23T03:12:18.1563305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1563584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1563847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1564066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1564289Z File "", line 1, in 2022-11-23T03:12:18.1564576Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1564756Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1565232Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1565640Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1566502Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1566910Z self.run() 2022-11-23T03:12:18.1567425Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1567833Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1568469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1568973Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1569750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1570225Z getattr(self, test_name)() 2022-11-23T03:12:18.1570756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1571262Z fn() 2022-11-23T03:12:18.1571825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1572304Z test(self, **param_kwargs) 2022-11-23T03:12:18.1572830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1573327Z return func(*args, **kwargs) 2022-11-23T03:12:18.1573804Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1574208Z self.run_subtests( 2022-11-23T03:12:18.1574817Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1575317Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1575920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1576438Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1577079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1577542Z output = model(*input) 2022-11-23T03:12:18.1578034Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1578531Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1579167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1579726Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1580316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1580831Z _lazy_init(state, module) 2022-11-23T03:12:18.1581200Z File "", line 1, in 2022-11-23T03:12:18.1581795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1582278Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1583125Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1583592Z return func(*args, **kwargs) 2022-11-23T03:12:18.1584471Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1585285Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1586319Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1586954Z p_assert( 2022-11-23T03:12:18.1587373Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1587768Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1588379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1588882Z traceback.print_stack() 2022-11-23T03:12:18.1589318Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1589675Z self.run() 2022-11-23T03:12:18.1590083Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1590595Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1591132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1591668Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1592280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1592689Z getattr(self, test_name)() 2022-11-23T03:12:18.1593345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1593798Z fn() 2022-11-23T03:12:18.1594616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1595031Z test(self, **param_kwargs) 2022-11-23T03:12:18.1595652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1596172Z return func(*args, **kwargs) 2022-11-23T03:12:18.1596681Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1599806Z self.run_subtests( 2022-11-23T03:12:18.1600423Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1600989Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1601618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1602054Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1602708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1603182Z output = model(*input) 2022-11-23T03:12:18.1603715Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1604173Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1604822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1605439Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1606101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1606512Z _lazy_init(state, module) 2022-11-23T03:12:18.1607080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1607644Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1608176Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1608620Z return func(*args, **kwargs) 2022-11-23T03:12:18.1609219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1609616Z p_assert( 2022-11-23T03:12:18.1610195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.1610657Z traceback.print_stack() 2022-11-23T03:12:18.1611141Z File "", line 1, in 2022-11-23T03:12:18.1611458Z File "", line 1, in 2022-11-23T03:12:18.1611994Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1612436Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1612828Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1613282Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1613810Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1614260Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1614701Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1615154Z self.run() 2022-11-23T03:12:18.1615625Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1616087Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1616580Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1617068Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1617529Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1617893Z self.run() 2022-11-23T03:12:18.1618457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1618957Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1619431Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1619853Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1620506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1620986Z getattr(self, test_name)() 2022-11-23T03:12:18.1621511Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1622004Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1622604Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1623054Z fn() 2022-11-23T03:12:18.1623558Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1624651Z getattr(self, test_name)() 2022-11-23T03:12:18.1625823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1626531Z test(self, **param_kwargs) 2022-11-23T03:12:18.1627182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1627618Z fn() 2022-11-23T03:12:18.1628127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1628645Z return func(*args, **kwargs) 2022-11-23T03:12:18.1629341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1629811Z test(self, **param_kwargs) 2022-11-23T03:12:18.1630235Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1630716Z self.run_subtests( 2022-11-23T03:12:18.1631350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1631763Z return func(*args, **kwargs) 2022-11-23T03:12:18.1632345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1632844Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1633495Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1633907Z self.run_subtests( 2022-11-23T03:12:18.1634499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1634994Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1635556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1636135Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1636769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1637239Z output = model(*input) 2022-11-23T03:12:18.1637785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1638398Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1638982Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1639396Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1640027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1640516Z output = model(*input) 2022-11-23T03:12:18.1641100Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1641671Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1780471Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1780875Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1781400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:18.1781805Z _lazy_init(state, module) 2022-11-23T03:12:18.1782321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1782748Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1783280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1783659Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1784910Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1785627Z _lazy_init(state, module) 2022-11-23T03:12:18.1786506Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1786873Z return func(*args, **kwargs) 2022-11-23T03:12:18.1787357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1787740Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1788261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1788611Z p_assert( 2022-11-23T03:12:18.1789055Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1789408Z return func(*args, **kwargs) 2022-11-23T03:12:18.1789873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1790225Z traceback.print_stack() 2022-11-23T03:12:18.1790733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1791089Z p_assert( 2022-11-23T03:12:18.1791524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.1792196Z traceback.print_stack() 2022-11-23T03:12:18.1792583Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1793038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1793497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1793953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1794302Z File "", line 1, in 2022-11-23T03:12:18.1794646Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1794994Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1795339Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1795678Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1796137Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1796447Z self.run() 2022-11-23T03:12:18.1796750Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1797092Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1797586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1797949Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1798442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1798810Z getattr(self, test_name)() 2022-11-23T03:12:18.1799294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1799628Z fn() 2022-11-23T03:12:18.1800089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.1800465Z test(self, **param_kwargs) 2022-11-23T03:12:18.1800948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1801306Z return func(*args, **kwargs) 2022-11-23T03:12:18.1801683Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1802029Z self.run_subtests( 2022-11-23T03:12:18.1802492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1802889Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1803412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1803807Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1804336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1804705Z output = model(*input) 2022-11-23T03:12:18.1805154Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1805513Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1806032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1806463Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1807001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1807367Z _lazy_init(state, module) 2022-11-23T03:12:18.1807841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1808220Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1808762Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1809127Z return func(*args, **kwargs) 2022-11-23T03:12:18.1809636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1809993Z p_assert( 2022-11-23T03:12:18.1810431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1810784Z traceback.print_stack() 2022-11-23T03:12:18.1811042Z File "", line 1, in 2022-11-23T03:12:18.1811448Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1811794Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1812138Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1812478Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1812904Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1813212Z self.run() 2022-11-23T03:12:18.1813512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1813850Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1814339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1814698Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1815190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1815558Z getattr(self, test_name)() 2022-11-23T03:12:18.1816045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1816379Z fn() 2022-11-23T03:12:18.1816840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.1817219Z test(self, **param_kwargs) 2022-11-23T03:12:18.1817706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1818063Z return func(*args, **kwargs) 2022-11-23T03:12:18.1818439Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1818786Z self.run_subtests( 2022-11-23T03:12:18.1819249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1819647Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1820167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1820562Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1821093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1821464Z output = model(*input) 2022-11-23T03:12:18.1821915Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1822269Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1822787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1823231Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1823767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1824723Z _lazy_init(state, module) 2022-11-23T03:12:18.1825669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1826306Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1826892Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1827253Z return func(*args, **kwargs) 2022-11-23T03:12:18.1827762Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1828118Z p_assert( 2022-11-23T03:12:18.1828553Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1828911Z traceback.print_stack() 2022-11-23T03:12:18.1829194Z File "", line 1, in 2022-11-23T03:12:18.1829477Z File "", line 1, in 2022-11-23T03:12:18.1829825Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1830194Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1830560Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1830990Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1831371Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1831737Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1832097Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1832430Z self.run() 2022-11-23T03:12:18.1832757Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1833123Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1833474Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1833836Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1834225Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1834541Z self.run() 2022-11-23T03:12:18.1835021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1835526Z 
self.run_test(test_name, pipe) 2022-11-23T03:12:18.1835877Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1836237Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1836773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1837171Z getattr(self, test_name)() 2022-11-23T03:12:18.1837643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1838022Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1838535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1838882Z fn() 2022-11-23T03:12:18.1839357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1839753Z getattr(self, test_name)() 2022-11-23T03:12:18.1840266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1840642Z test(self, **param_kwargs) 2022-11-23T03:12:18.1841156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1841512Z fn() 2022-11-23T03:12:18.1841970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1842353Z return func(*args, **kwargs) 2022-11-23T03:12:18.1842864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1843240Z test(self, **param_kwargs) 2022-11-23T03:12:18.1843641Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1844017Z self.run_subtests( 2022-11-23T03:12:18.1844596Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1844979Z return func(*args, **kwargs) 2022-11-23T03:12:18.1845486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1845911Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1846321Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1846694Z self.run_subtests( 2022-11-23T03:12:18.1847196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1847619Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1848124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1848603Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1849159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1849536Z output = model(*input) 2022-11-23T03:12:18.1850053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1850464Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1850958Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1851325Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1851861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1852250Z output = model(*input) 2022-11-23T03:12:18.1852763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T03:12:18.1853217Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1853739Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1854123Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1854633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1855027Z _lazy_init(state, module) 2022-11-23T03:12:18.1855552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1855986Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1856533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1856941Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1857493Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1857867Z _lazy_init(state, module) 2022-11-23T03:12:18.1858354Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1858732Z return func(*args, **kwargs) 2022-11-23T03:12:18.1859217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1859616Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1860157Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1860535Z p_assert( 2022-11-23T03:12:18.1860983Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1861361Z return func(*args, **kwargs) 2022-11-23T03:12:18.1861893Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1862261Z traceback.print_stack() 2022-11-23T03:12:18.1862798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1863173Z p_assert( 2022-11-23T03:12:18.1863615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1864408Z traceback.print_stack() 2022-11-23T03:12:18.1865141Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1866078Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1866569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1867138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1867520Z File "", line 1, in 2022-11-23T03:12:18.1867786Z File "", line 1, in 2022-11-23T03:12:18.1868154Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1868525Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1868929Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1869296Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1869664Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1870031Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1870388Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1870747Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1871127Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1871452Z self.run() 2022-11-23T03:12:18.1871794Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1872127Z self.run() 2022-11-23T03:12:18.1872438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1872797Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1873161Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1873520Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1873798Z File "", line 1, in 2022-11-23T03:12:18.1874318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1874703Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1875181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1875578Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1875956Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1876311Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1876847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1877238Z getattr(self, test_name)() 2022-11-23T03:12:18.1877591Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1877945Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1878479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1878869Z getattr(self, test_name)() 2022-11-23T03:12:18.1879361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1879730Z fn() 2022-11-23T03:12:18.1880278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:18.1880658Z fn() 2022-11-23T03:12:18.1880981Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1881317Z self.run() 2022-11-23T03:12:18.1881817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1882190Z test(self, **param_kwargs) 2022-11-23T03:12:18.1882542Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1882907Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1883421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1883810Z test(self, **param_kwargs) 2022-11-23T03:12:18.1884087Z File "", line 1, in 2022-11-23T03:12:18.1884690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1885064Z return func(*args, **kwargs) 2022-11-23T03:12:18.1885557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1885941Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1886439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1886829Z return func(*args, **kwargs) 2022-11-23T03:12:18.1887182Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1887529Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1887941Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1888312Z self.run_subtests( 2022-11-23T03:12:18.1888825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1889197Z getattr(self, test_name)() 
2022-11-23T03:12:18.1889597Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1889964Z self.run_subtests( 2022-11-23T03:12:18.1890289Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1890655Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1891186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1891604Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1892130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1892493Z fn() 2022-11-23T03:12:18.1892978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1893386Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1893908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1894328Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1894731Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1895050Z self.run() 2022-11-23T03:12:18.1895546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1895935Z test(self, **param_kwargs) 2022-11-23T03:12:18.1896444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1896842Z output = model(*input) 2022-11-23T03:12:18.1897405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1897781Z return func(*args, **kwargs) 
2022-11-23T03:12:18.1898294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1898711Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1899089Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1899437Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1899926Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1900307Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1900700Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1901069Z self.run_subtests( 2022-11-23T03:12:18.1901645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1902039Z output = model(*input) 2022-11-23T03:12:18.1902506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1902891Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1903437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1904101Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1905170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1905964Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1906624Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1907004Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1907531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T03:12:18.1907922Z getattr(self, test_name)() 2022-11-23T03:12:18.1908436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1909010Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1909574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1909942Z fn() 2022-11-23T03:12:18.1910412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1910816Z _lazy_init(state, module) 2022-11-23T03:12:18.1911394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1911805Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1912359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1912798Z _lazy_init(state, module) 2022-11-23T03:12:18.1913323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1913703Z test(self, **param_kwargs) 2022-11-23T03:12:18.1914221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1914634Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1915163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1915563Z output = model(*input) 2022-11-23T03:12:18.1916068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1916579Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1917099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1917497Z return func(*args, **kwargs) 2022-11-23T03:12:18.1917991Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1918353Z return func(*args, **kwargs) 2022-11-23T03:12:18.1918831Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1919214Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1919764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1920132Z p_assert( 2022-11-23T03:12:18.1920596Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1921048Z return func(*args, **kwargs) 2022-11-23T03:12:18.1921435Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1921812Z self.run_subtests( 2022-11-23T03:12:18.1922330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1922769Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1923302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1923684Z traceback.print_stack() 2022-11-23T03:12:18.1924255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1924621Z p_assert( 2022-11-23T03:12:18.1925097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1925545Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 
2022-11-23T03:12:18.1926079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1926480Z _lazy_init(state, module) 2022-11-23T03:12:18.1926969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1927346Z traceback.print_stack() 2022-11-23T03:12:18.1927848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1928267Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1928796Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1929183Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1929734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1930138Z output = model(*input) 2022-11-23T03:12:18.1930629Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1930994Z return func(*args, **kwargs) 2022-11-23T03:12:18.1931469Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1931860Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1932385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1932774Z p_assert( 2022-11-23T03:12:18.1933290Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1933744Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1934314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.1934710Z traceback.print_stack() 2022-11-23T03:12:18.1935230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1935610Z _lazy_init(state, module) 2022-11-23T03:12:18.1936113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1936531Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1937045Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1937410Z return func(*args, **kwargs) 2022-11-23T03:12:18.1937944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1938328Z p_assert( 2022-11-23T03:12:18.1938831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1939213Z traceback.print_stack() 2022-11-23T03:12:18.1939613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1940111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1940574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1941059Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.1941191Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1941403Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1941553Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1941739Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1941894Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1942114Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1942220Z self.run() 2022-11-23T03:12:18.1942423Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1942570Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1942915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1943049Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1943396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1943526Z getattr(self, test_name)() 2022-11-23T03:12:18.1944192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1944374Z fn() 2022-11-23T03:12:18.1945087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1945329Z test(self, **param_kwargs) 2022-11-23T03:12:18.1946030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1946257Z return func(*args, **kwargs) 2022-11-23T03:12:18.1946495Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1946613Z self.run_subtests( 2022-11-23T03:12:18.1946972Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1947140Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1947501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1947657Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1948124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1948260Z output = model(*input) 2022-11-23T03:12:18.1948571Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1948716Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1949095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1949272Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1949644Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1949768Z _lazy_init(state, module) 2022-11-23T03:12:18.1950119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1950327Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1950652Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1950784Z return func(*args, **kwargs) 2022-11-23T03:12:18.1951168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1951271Z p_assert( 2022-11-23T03:12:18.1951608Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.1951742Z traceback.print_stack() 2022-11-23T03:12:18.1951871Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1952000Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1952193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1952336Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1952466Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1952682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1952829Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1953033Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1953184Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1953364Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1953519Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1953729Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1953870Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1954085Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1954189Z self.run() 2022-11-23T03:12:18.1954394Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1954506Z self.run() 2022-11-23T03:12:18.1954689Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1954840Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1955041Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1955187Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1955386Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1955531Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1955748Z File
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1955852Z self.run() 2022-11-23T03:12:18.1956176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1956317Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1956654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1956842Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1957054Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1957199Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1957569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1957676Z getattr(self, test_name)() 2022-11-23T03:12:18.1958041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1958164Z getattr(self, test_name)() 2022-11-23T03:12:18.1958504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1958639Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1958992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1959145Z fn() 2022-11-23T03:12:18.1959512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1959590Z fn() 2022-11-23T03:12:18.1959955Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1960079Z getattr(self, test_name)() 2022-11-23T03:12:18.1960442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.1960564Z test(self, **param_kwargs) 2022-11-23T03:12:18.1960920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1961046Z test(self, **param_kwargs) 2022-11-23T03:12:18.1961398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1961479Z fn() 2022-11-23T03:12:18.1961846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1961973Z return func(*args, **kwargs) 2022-11-23T03:12:18.1962326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1962451Z return func(*args, **kwargs) 2022-11-23T03:12:18.1962815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1962940Z test(self, **param_kwargs) 2022-11-23T03:12:18.1963194Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1963288Z self.run_subtests( 2022-11-23T03:12:18.1963530Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1963650Z self.run_subtests( 2022-11-23T03:12:18.1964011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1964143Z return func(*args, **kwargs) 2022-11-23T03:12:18.1964497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1964667Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1965013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 
2022-11-23T03:12:18.1965155Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1965410Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1965522Z self.run_subtests( 2022-11-23T03:12:18.1965891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1966090Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1966457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1966609Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1966964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1967105Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1967478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1967605Z output = model(*input) 2022-11-23T03:12:18.1967986Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1968103Z output = model(*input) 2022-11-23T03:12:18.1968597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1968751Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1969132Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1969280Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1969587Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1969727Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1970099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1970221Z output = model(*input) 2022-11-23T03:12:18.1970603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1970779Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1971171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1971345Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1971650Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1971795Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1972169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1972293Z _lazy_init(state, module) 2022-11-23T03:12:18.1972661Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1972779Z _lazy_init(state, module) 2022-11-23T03:12:18.1973146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1973325Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1973657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1973811Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1974155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1974291Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1974656Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1974774Z _lazy_init(state, module) 2022-11-23T03:12:18.1975106Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1975235Z return func(*args, **kwargs) 2022-11-23T03:12:18.1975551Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1975732Z return func(*args, **kwargs) 2022-11-23T03:12:18.1976098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1976240Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1976620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1976724Z p_assert( 2022-11-23T03:12:18.1977105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1977206Z p_assert( 2022-11-23T03:12:18.1977521Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.1977657Z return func(*args, **kwargs) 2022-11-23T03:12:18.1977993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1978184Z traceback.print_stack() 2022-11-23T03:12:18.1978522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1978649Z traceback.print_stack() 2022-11-23T03:12:18.1979032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.1979132Z p_assert( 2022-11-23T03:12:18.1979441Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.1979571Z traceback.print_stack() 2022-11-23T03:12:18.1979810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1980045Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1980277Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1980522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.1980654Z File "", line 1, in 2022-11-23T03:12:18.1980864Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1980987Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1981190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1981346Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1981557Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1981660Z self.run() 2022-11-23T03:12:18.1981866Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1982014Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1982359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1982477Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1982839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1982963Z getattr(self, test_name)() 2022-11-23T03:12:18.1983329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1983426Z fn() 2022-11-23T03:12:18.1983555Z File "", line 1, in 
2022-11-23T03:12:18.1984235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1984464Z test(self, **param_kwargs) 2022-11-23T03:12:18.1984667Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1985056Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1985317Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1986122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1986357Z return func(*args, **kwargs) 2022-11-23T03:12:18.1986681Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1986828Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1987011Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1987170Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1987418Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1987533Z self.run_subtests( 2022-11-23T03:12:18.1987741Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1987890Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1988102Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1988275Z self.run() 2022-11-23T03:12:18.1988621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.1988790Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.1989002Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1989112Z self.run() 2022-11-23T03:12:18.1989320Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run
2022-11-23T03:12:18.1989468Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1989838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.1989993Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.1990179Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1990328Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1990671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1990810Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1991186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.1991310Z output = model(*input) 2022-11-23T03:12:18.1991440Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.1991774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.1991888Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.1992252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1992376Z getattr(self, test_name)() 2022-11-23T03:12:18.1992703Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.1992857Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.1993213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.1993342Z getattr(self, test_name)() 2022-11-23T03:12:18.1993555Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.1993679Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.1994036Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1994135Z fn() 2022-11-23T03:12:18.1994513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.1994690Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.1995053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.1995155Z fn() 2022-11-23T03:12:18.1995421Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.1995560Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.1995933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1996059Z test(self, **param_kwargs) 2022-11-23T03:12:18.1996424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.1996546Z _lazy_init(state, module) 2022-11-23T03:12:18.1996915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.1997039Z test(self, **param_kwargs) 2022-11-23T03:12:18.1997256Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.1997391Z self.run() 2022-11-23T03:12:18.1997753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1997883Z return func(*args, **kwargs) 2022-11-23T03:12:18.1998238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.1998381Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.1998741Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.1998867Z return func(*args, **kwargs) 2022-11-23T03:12:18.1999073Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.1999201Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.1999452Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.1999571Z self.run_subtests( 2022-11-23T03:12:18.1999914Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2000047Z return func(*args, **kwargs) 2022-11-23T03:12:18.2000298Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.2000411Z self.run_subtests( 2022-11-23T03:12:18.2000730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2000864Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2001220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2001384Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2001766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2001873Z p_assert( 2022-11-23T03:12:18.2002247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2002400Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2002753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2002896Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2003250Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2003376Z getattr(self, test_name)() 2022-11-23T03:12:18.2003732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2003882Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2004219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2004347Z traceback.print_stack() 2022-11-23T03:12:18.2004768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2004874Z output = model(*input) 2022-11-23T03:12:18.2005247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2005370Z output = model(*input) 2022-11-23T03:12:18.2005723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2005827Z fn() 2022-11-23T03:12:18.2006152Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2006291Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2006615Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2006782Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2007155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2007278Z test(self, **param_kwargs) 2022-11-23T03:12:18.2007656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2007833Z args, kwargs = _root_pre_forward(self, self, *args, 
**kwargs) 2022-11-23T03:12:18.2008212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2008384Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2008745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2008850Z return func(*args, **kwargs) 2022-11-23T03:12:18.2009215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2009344Z _lazy_init(state, module) 2022-11-23T03:12:18.2009704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2009823Z _lazy_init(state, module) 2022-11-23T03:12:18.2010073Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.2010185Z self.run_subtests( 2022-11-23T03:12:18.2010536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2010661Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2011010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2011147Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2011572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2011737Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2012070Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2012200Z return func(*args, **kwargs) 2022-11-23T03:12:18.2012535Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.2012641Z return func(*args, **kwargs) 2022-11-23T03:12:18.2013002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2013160Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2013534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2013635Z p_assert( 2022-11-23T03:12:18.2014068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2014174Z p_assert( 2022-11-23T03:12:18.2014556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2014673Z output = model(*input) 2022-11-23T03:12:18.2014992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2015117Z traceback.print_stack() 2022-11-23T03:12:18.2015455Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2015579Z traceback.print_stack() 2022-11-23T03:12:18.2015909Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2016048Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2016479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2016635Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2016998Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2017118Z _lazy_init(state, module) 2022-11-23T03:12:18.2017474Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2017615Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2017956Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2018076Z return func(*args, **kwargs) 2022-11-23T03:12:18.2018458Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2018561Z p_assert( 2022-11-23T03:12:18.2018885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2019010Z traceback.print_stack() 2022-11-23T03:12:18.2019245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2019487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2019719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2019954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2020085Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2020194Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2020403Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2020548Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2020678Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2020888Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2021028Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2021225Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2021380Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2021489Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2021695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2021837Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2022042Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2022189Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2022397Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2022499Z self.run() 2022-11-23T03:12:18.2022690Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2022881Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2023094Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2023237Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2023440Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2023588Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2023797Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2024159Z self.run() 2022-11-23T03:12:18.2024516Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2024784Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2025172Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2025470Z self.run() 2022-11-23T03:12:18.2025851Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2026111Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2026339Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2026481Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2026817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2026955Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2027160Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2027261Z self.run() 2022-11-23T03:12:18.2027600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2027729Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2028061Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2028180Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2028540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2028665Z getattr(self, test_name)() 2022-11-23T03:12:18.2028864Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2029007Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2029364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2029485Z getattr(self, test_name)() 2022-11-23T03:12:18.2029839Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2029943Z getattr(self, test_name)() 2022-11-23T03:12:18.2030305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2030405Z fn() 2022-11-23T03:12:18.2030748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2030879Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2031239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2031359Z test(self, **param_kwargs) 2022-11-23T03:12:18.2031715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2031794Z fn() 2022-11-23T03:12:18.2032144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2032237Z fn() 2022-11-23T03:12:18.2032596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2032721Z getattr(self, test_name)() 2022-11-23T03:12:18.2033138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2033276Z return func(*args, **kwargs) 2022-11-23T03:12:18.2033644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2033749Z test(self, **param_kwargs) 2022-11-23T03:12:18.2034101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2034225Z test(self, **param_kwargs) 2022-11-23T03:12:18.2034578Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2034702Z return func(*args, **kwargs) 2022-11-23T03:12:18.2035054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2035199Z fn() 2022-11-23T03:12:18.2035456Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.2035551Z self.run_subtests( 2022-11-23T03:12:18.2035903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2036030Z return func(*args, **kwargs) 2022-11-23T03:12:18.2036276Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.2036389Z self.run_subtests( 2022-11-23T03:12:18.2036755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2036875Z test(self, **param_kwargs) 2022-11-23T03:12:18.2037219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2037366Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2037613Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.2037738Z self.run_subtests( 2022-11-23T03:12:18.2038086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2038245Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2038598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2038755Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 
2022-11-23T03:12:18.2039119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2039252Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2039611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2039741Z return func(*args, **kwargs) 2022-11-23T03:12:18.2040105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2040256Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2040616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2040762Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2041139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2041240Z output = model(*input) 2022-11-23T03:12:18.2041488Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T03:12:18.2041607Z self.run_subtests( 2022-11-23T03:12:18.2041978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2042148Z output = model(*input) 2022-11-23T03:12:18.2042531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2042652Z output = model(*input) 2022-11-23T03:12:18.2042975Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2043098Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2043427Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T03:12:18.2043564Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2043917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2044076Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2044455Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2044594Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2044976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2045151Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2045507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2045686Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2046049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2046199Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2046573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2046754Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2047118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2047238Z _lazy_init(state, module) 2022-11-23T03:12:18.2047579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2047702Z _lazy_init(state, module) 2022-11-23T03:12:18.2048050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2048197Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2048568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2048685Z output = model(*input) 2022-11-23T03:12:18.2049025Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2049154Z return func(*args, **kwargs) 2022-11-23T03:12:18.2049499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2049618Z _lazy_init(state, module) 2022-11-23T03:12:18.2049970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2050111Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2050452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2050592Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2050913Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2051052Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2051460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2051573Z p_assert( 2022-11-23T03:12:18.2051908Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2052036Z return func(*args, **kwargs) 2022-11-23T03:12:18.2052369Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2052490Z return func(*args, **kwargs) 2022-11-23T03:12:18.2052870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T03:12:18.2052972Z p_assert( 2022-11-23T03:12:18.2053326Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2053502Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2053898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2054022Z traceback.print_stack() 2022-11-23T03:12:18.2054392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2054511Z _lazy_init(state, module) 2022-11-23T03:12:18.2054887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2054986Z p_assert( 2022-11-23T03:12:18.2055301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2055429Z traceback.print_stack() 2022-11-23T03:12:18.2055773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2055914Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2056243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2056379Z traceback.print_stack() 2022-11-23T03:12:18.2056712Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2056835Z return func(*args, **kwargs) 2022-11-23T03:12:18.2057190Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2057296Z p_assert( 2022-11-23T03:12:18.2057626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2057755Z traceback.print_stack() 2022-11-23T03:12:18.2058001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2058237Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2058475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2058709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2058961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2059219Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2059446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2059676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2059900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2060124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2060353Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2060580Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2060866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2061080Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2061305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2061536Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2061763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2061986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2062211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2062439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2062710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2062912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2063136Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2063355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2065177Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:18.2065626Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T03:12:18.2066846Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T03:12:18.2067070Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T03:12:18.2068040Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:18.2068267Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T03:12:18.2069285Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:18.2069505Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T03:12:18.2069738Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2069968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2070279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2070517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2070730Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2070954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2071179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2071403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2071625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2071847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2072068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2072353Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2072561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2072784Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2073009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2073232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2073456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2073676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2073898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2074119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2074333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2074555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2074775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2074995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2075215Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2075434Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2075655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2075874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2076098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2076306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2076528Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2076748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2076968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2077188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2077412Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2077632Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2077854Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2078107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2078339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2078561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2078784Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2079004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2079224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2079443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2079663Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2079882Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2080136Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2080359Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2080581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2080801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2081023Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2081242Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2081462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2081680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2081886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2082113Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2082334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2082554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2082777Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2083004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2083225Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2083445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2083652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2083884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2084107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2084332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2084560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2084780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2085007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2085231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2085460Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2085667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2085789Z dist init r=1, world=4 2022-11-23T03:12:18.2086166Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2086493Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2086800Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2087106Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2087407Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2087759Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2088059Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2088358Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2088655Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2088936Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.2089236Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2089539Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2089651Z dist init r=3, world=4 2022-11-23T03:12:18.2089974Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2090286Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2090592Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2090898Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2091203Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2091502Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2091804Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2092084Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.2092385Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2092749Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2093056Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2093359Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2093468Z dist init r=0, world=4 2022-11-23T03:12:18.2093790Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2094103Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2094460Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2094765Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2095067Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2095351Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.2095974Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2096295Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2096594Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2096895Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2097191Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2097488Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2097598Z dist init r=2, world=4 2022-11-23T03:12:18.2097927Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2098239Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2098546Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2098849Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.2099135Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2099437Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2099795Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2100104Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2100403Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2100701Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2101001Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2101347Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2101450Z ok (36.286s) 2022-11-23T03:12:18.2101665Z test_delayed_reduce_scatter_offload_false_no_shard (__main__.TestParityWithDDP) 2022-11-23T03:12:18.2101975Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12298 2022-11-23T03:12:18.2102179Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12299 2022-11-23T03:12:18.2102395Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12300 2022-11-23T03:12:18.2102608Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12301 2022-11-23T03:12:18.2102995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2103177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2103561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2103750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2104547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2104841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2105565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2105917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2106470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2106740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2107127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2107315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2107677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.2107850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2108209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2108395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2108640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.2108883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.2109244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.2109495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.2109898Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2110292Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2110686Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2111052Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.2111341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.2111640Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.2111875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.2112103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.2112340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2112575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2112808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2113019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2114054Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2114178Z warnings.warn( 2022-11-23T03:12:18.2115198Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.2115307Z warnings.warn( 2022-11-23T03:12:18.2116314Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2116427Z warnings.warn( 2022-11-23T03:12:18.2117430Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2117537Z warnings.warn( 2022-11-23T03:12:18.2117774Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2118059Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2118299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2118530Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2118761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2118969Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2119199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2119426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2119650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2119874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2120151Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2120377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2120601Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2120806Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2121037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2121263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2121487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2121710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2121936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2122166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2122388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2122592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2122816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2123038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2123260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2123484Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2123707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2123931Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2124155Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2124378Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2124584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2124804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2125027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2125249Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2125473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2125693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2126500Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2127253Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2127992Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2128780Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2129010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2129240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2129453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2129680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2129905Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2130130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2130364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2130590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2130816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2131040Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2131246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2131469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2131691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2131915Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2132140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2132370Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2132592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2132816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2133021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2133245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2133466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2133727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2133950Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2134173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2134445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2134676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2134903Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2135106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2135327Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2135658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2135883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2136105Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2136378Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2136605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2136826Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2137030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2137253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2137476Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2137700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2137920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2138143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2138373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2138595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2138815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2139019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2139240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2139462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2139682Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2140432Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2141174Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2141906Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2142683Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2142925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2143156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2143383Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2143609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2143817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2144499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2144935Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2145359Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2145676Z dist init r=1, world=4 2022-11-23T03:12:18.2145866Z dist init r=0, world=4 2022-11-23T03:12:18.2146054Z dist init r=3, world=4 2022-11-23T03:12:18.2146200Z dist init r=2, world=4 2022-11-23T03:12:18.2146304Z ok (6.122s) 2022-11-23T03:12:18.2146518Z test_delayed_reduce_scatter_offload_false_none (__main__.TestParityWithDDP) 2022-11-23T03:12:18.2147439Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82704 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T03:12:18.2147663Z test_delayed_reduce_scatter_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T03:12:18.2148546Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82398 for platform(s) linux, rocm. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T03:12:18.2148765Z test_delayed_reduce_scatter_offload_true_no_shard (__main__.TestParityWithDDP) 2022-11-23T03:12:18.2149074Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12599 2022-11-23T03:12:18.2149292Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12600 2022-11-23T03:12:18.2149506Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12601 2022-11-23T03:12:18.2149723Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12602 2022-11-23T03:12:18.2150077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2150260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2150642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2150833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2151203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2151375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2151747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2151933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2152275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2152519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:12:18.2152908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2153095Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2153456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2153629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2154001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2154187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2154432Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.2154658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.2154949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.2155349Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2155585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.2155980Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2156367Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2156748Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.2156975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.2157206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.2157413Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.2157638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.2157871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2158104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2158329Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2158557Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2159577Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2159694Z warnings.warn( 2022-11-23T03:12:18.2160708Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.2160817Z warnings.warn( 2022-11-23T03:12:18.2161862Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2161982Z warnings.warn( 2022-11-23T03:12:18.2162984Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.2163075Z warnings.warn( 2022-11-23T03:12:18.2163204Z File "", line 1, in 2022-11-23T03:12:18.2163373Z File "", line 1, in 2022-11-23T03:12:18.2163590Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2163733Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2163943Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2164083Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2164284Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2164416Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2164614Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2164762Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2164974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2165076Z self.run() 2022-11-23T03:12:18.2165286Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2165389Z self.run() 2022-11-23T03:12:18.2165575Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2165721Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2165922Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2166063Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2166411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2166543Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2166879Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2167008Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2167349Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2167476Z getattr(self, test_name)() 2022-11-23T03:12:18.2167840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2167967Z getattr(self, test_name)() 2022-11-23T03:12:18.2168328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2168428Z fn() 2022-11-23T03:12:18.2168786Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2168897Z fn() 2022-11-23T03:12:18.2169288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2169416Z test(self, **param_kwargs) 2022-11-23T03:12:18.2169775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2169902Z test(self, **param_kwargs) 2022-11-23T03:12:18.2170307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2170442Z return func(*args, **kwargs) 2022-11-23T03:12:18.2170803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2170932Z return func(*args, **kwargs) 2022-11-23T03:12:18.2171166Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2171283Z self.run_subtests( 2022-11-23T03:12:18.2171534Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2171650Z self.run_subtests( 2022-11-23T03:12:18.2172003Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2172216Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2172575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2172738Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2173083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2173244Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2173605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2173755Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2174130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2174247Z output = model(*input) 2022-11-23T03:12:18.2174620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2174744Z output = model(*input) 2022-11-23T03:12:18.2174856Z File "", line 1, in 2022-11-23T03:12:18.2175182Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2175325Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2175649Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2175787Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2176161Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2176337Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2176552Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2176677Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2177062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2177239Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2177607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2177732Z _lazy_init(state, module) 2022-11-23T03:12:18.2177935Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2178089Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2178222Z File "", line 1, in 2022-11-23T03:12:18.2178570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2178693Z _lazy_init(state, module) 2022-11-23T03:12:18.2179051Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2179265Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2179490Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2179598Z self.run() 2022-11-23T03:12:18.2179950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2180074Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2180278Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2180429Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2180640Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2180785Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2181126Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2181304Z return func(*args, **kwargs) 2022-11-23T03:12:18.2181652Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2181760Z return func(*args, **kwargs) 2022-11-23T03:12:18.2182100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2182237Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2182625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2182731Z p_assert( 2022-11-23T03:12:18.2182936Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2183091Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2183430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2183539Z traceback.print_stack() 2022-11-23T03:12:18.2184263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2184462Z p_assert( 2022-11-23T03:12:18.2185163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2185392Z getattr(self, test_name)() 2022-11-23T03:12:18.2185794Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2185978Z self.run() 2022-11-23T03:12:18.2186393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2186503Z traceback.print_stack() 2022-11-23T03:12:18.2186862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:18.2186959Z fn() 2022-11-23T03:12:18.2187163Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2187316Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2187684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2187811Z test(self, **param_kwargs) 2022-11-23T03:12:18.2188128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2188264Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2188622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2188751Z return func(*args, **kwargs) 2022-11-23T03:12:18.2189115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2189238Z getattr(self, test_name)() 2022-11-23T03:12:18.2189490Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2189611Z self.run_subtests( 2022-11-23T03:12:18.2190026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2190140Z fn() 2022-11-23T03:12:18.2190498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2190662Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2191033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2191158Z test(self, **param_kwargs) 2022-11-23T03:12:18.2191518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2191675Z fsdp_loss 
= self._train_for_several_steps( 2022-11-23T03:12:18.2192011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2192207Z return func(*args, **kwargs) 2022-11-23T03:12:18.2192591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2192715Z output = model(*input) 2022-11-23T03:12:18.2192975Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2193091Z self.run_subtests( 2022-11-23T03:12:18.2193421Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2193564Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2193896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2194062Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2194438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2194631Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2194995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2195151Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2195519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2195644Z _lazy_init(state, module) 2022-11-23T03:12:18.2196018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2196120Z output = model(*input) 2022-11-23T03:12:18.2196473Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2196624Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2196955Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2197101Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2197443Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2197571Z return func(*args, **kwargs) 2022-11-23T03:12:18.2197927Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2198104Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2198484Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2198591Z p_assert( 2022-11-23T03:12:18.2198955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2199081Z _lazy_init(state, module) 2022-11-23T03:12:18.2199463Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2199696Z traceback.print_stack() 2022-11-23T03:12:18.2200029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2200175Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2200512Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2200640Z return func(*args, **kwargs) 2022-11-23T03:12:18.2201021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2201127Z p_assert( 
2022-11-23T03:12:18.2201464Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2201639Z traceback.print_stack() 2022-11-23T03:12:18.2201860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2202103Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2202336Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2202687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2202821Z File "", line 1, in 2022-11-23T03:12:18.2203037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2203256Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2203463Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2203596Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2203812Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2203924Z self.run() 2022-11-23T03:12:18.2204133Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2204281Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2204625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2204756Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2205117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2205224Z getattr(self, test_name)() 2022-11-23T03:12:18.2205587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2205684Z fn() 2022-11-23T03:12:18.2206051Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2206206Z test(self, **param_kwargs) 2022-11-23T03:12:18.2206851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2206989Z return func(*args, **kwargs) 2022-11-23T03:12:18.2207244Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2207362Z self.run_subtests( 2022-11-23T03:12:18.2207728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2207872Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2208238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2208394Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2208770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2208895Z output = model(*input) 2022-11-23T03:12:18.2209263Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2209409Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2209785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2209946Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2210312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2210430Z _lazy_init(state, module) 2022-11-23T03:12:18.2210780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2210920Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2211255Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2211489Z return func(*args, **kwargs) 2022-11-23T03:12:18.2211872Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2211957Z p_assert( 2022-11-23T03:12:18.2212290Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2212415Z traceback.print_stack() 2022-11-23T03:12:18.2212545Z File "", line 1, in 2022-11-23T03:12:18.2212753Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2212892Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2213092Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2213241Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2213434Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2213540Z self.run() 2022-11-23T03:12:18.2213746Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2213890Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2214230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2214361Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2214720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2214826Z getattr(self, test_name)() 2022-11-23T03:12:18.2215180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2215277Z fn() 2022-11-23T03:12:18.2215641Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2215768Z test(self, **param_kwargs) 2022-11-23T03:12:18.2216125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2216248Z return func(*args, **kwargs) 2022-11-23T03:12:18.2216499Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2216593Z self.run_subtests( 2022-11-23T03:12:18.2216941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2217102Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2217464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2217616Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2217988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2218109Z output = model(*input) 2022-11-23T03:12:18.2218476Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2218606Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2218983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2219161Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2219527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2219764Z _lazy_init(state, module) 2022-11-23T03:12:18.2220113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2220256Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2220591Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2220744Z return func(*args, **kwargs) 2022-11-23T03:12:18.2221125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2221228Z p_assert( 2022-11-23T03:12:18.2221564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2221689Z traceback.print_stack() 2022-11-23T03:12:18.2221817Z File "", line 1, in 2022-11-23T03:12:18.2222024Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2222164Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2222347Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2222497Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2222709Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2222819Z self.run() 2022-11-23T03:12:18.2223020Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2223164Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2223498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2223615Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2224369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2224600Z getattr(self, test_name)() 2022-11-23T03:12:18.2225286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2225461Z fn() 2022-11-23T03:12:18.2226158Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2226297Z test(self, **param_kwargs) 2022-11-23T03:12:18.2226665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2226772Z return func(*args, **kwargs) 2022-11-23T03:12:18.2227027Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2227140Z self.run_subtests( 2022-11-23T03:12:18.2227491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2227652Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2228015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2228166Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2228541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2228722Z output = model(*input) 2022-11-23T03:12:18.2229060Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2229199Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2229572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2229748Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2230115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2230236Z _lazy_init(state, module) 2022-11-23T03:12:18.2230585Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2230725Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2231114Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2231238Z return func(*args, **kwargs) 2022-11-23T03:12:18.2231615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2231718Z p_assert( 2022-11-23T03:12:18.2232051Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2232174Z traceback.print_stack() 2022-11-23T03:12:18.2232301Z File "", line 1, in 2022-11-23T03:12:18.2232491Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2232632Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2232831Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2232981Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2233197Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2233302Z self.run() 2022-11-23T03:12:18.2233503Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2233650Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2233973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2234105Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2234461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2234585Z getattr(self, test_name)() 2022-11-23T03:12:18.2234940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2235037Z fn() 2022-11-23T03:12:18.2235396Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2235522Z test(self, **param_kwargs) 2022-11-23T03:12:18.2235862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2235990Z return func(*args, **kwargs) 2022-11-23T03:12:18.2236246Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2236359Z self.run_subtests( 2022-11-23T03:12:18.2236707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2236868Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2237231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2237383Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2237737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2237904Z output = model(*input) 2022-11-23T03:12:18.2238237Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2238376Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2238747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2238922Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2239282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2239403Z _lazy_init(state, module) 2022-11-23T03:12:18.2239733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2239875Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2240264Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2240390Z return func(*args, **kwargs) 2022-11-23T03:12:18.2240770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2240871Z p_assert( 2022-11-23T03:12:18.2241205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2241330Z traceback.print_stack() 2022-11-23T03:12:18.2241550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2241785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2242014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2242245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2242380Z File "", line 1, in 2022-11-23T03:12:18.2242591Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2242733Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2242917Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2243070Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2243281Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2243385Z self.run() 2022-11-23T03:12:18.2243585Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2243730Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2244069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2244200Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2244551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2244675Z getattr(self, test_name)() 2022-11-23T03:12:18.2245034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2245130Z fn() 2022-11-23T03:12:18.2245493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2245615Z test(self, **param_kwargs) 2022-11-23T03:12:18.2245967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2246090Z return func(*args, **kwargs) 2022-11-23T03:12:18.2246323Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2246436Z self.run_subtests( 2022-11-23T03:12:18.2246844Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2247011Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2247373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2247524Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2247896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2248014Z output = model(*input) 2022-11-23T03:12:18.2248320Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2248458Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2248832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2249101Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2249469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2249590Z _lazy_init(state, module) 2022-11-23T03:12:18.2249939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2250080Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2250399Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2250524Z return func(*args, **kwargs) 2022-11-23T03:12:18.2250897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2250998Z p_assert( 2022-11-23T03:12:18.2251333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2251463Z traceback.print_stack() 2022-11-23T03:12:18.2251593Z File "", line 1, in 2022-11-23T03:12:18.2251801Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2251926Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2252125Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2252273Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2252483Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2252585Z self.run() 2022-11-23T03:12:18.2252788Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2252931Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2253252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2253387Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2253754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2253876Z getattr(self, test_name)() 2022-11-23T03:12:18.2254234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2254330Z fn() 2022-11-23T03:12:18.2254694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2254815Z test(self, **param_kwargs) 2022-11-23T03:12:18.2255152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2255275Z return func(*args, **kwargs) 2022-11-23T03:12:18.2255524Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2255637Z self.run_subtests( 2022-11-23T03:12:18.2256033Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2256200Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2256564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2256715Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2257071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2257192Z output = model(*input) 2022-11-23T03:12:18.2257516Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2257659Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2258032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2258256Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2258623Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2258743Z _lazy_init(state, module) 2022-11-23T03:12:18.2259073Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2259216Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2259551Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2259676Z return func(*args, **kwargs) 2022-11-23T03:12:18.2260048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2260149Z p_assert( 2022-11-23T03:12:18.2260482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2260611Z traceback.print_stack() 2022-11-23T03:12:18.2260726Z File "", line 1, in 2022-11-23T03:12:18.2260936Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2261077Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2261278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2261428Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2261639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2261742Z self.run() 2022-11-23T03:12:18.2261943Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2262071Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2262408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2262540Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2262909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2263032Z getattr(self, test_name)() 2022-11-23T03:12:18.2263390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2263486Z fn() 2022-11-23T03:12:18.2263830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2264390Z test(self, **param_kwargs) 2022-11-23T03:12:18.2265074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2265294Z return func(*args, **kwargs) 2022-11-23T03:12:18.2265770Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2265964Z self.run_subtests( 2022-11-23T03:12:18.2266545Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2266719Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2267067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2267221Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2267593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2267713Z output = model(*input) 2022-11-23T03:12:18.2268036Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2268176Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2268552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2268816Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2269227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2269332Z _lazy_init(state, module) 2022-11-23T03:12:18.2269682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2269824Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2270160Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2270282Z return func(*args, **kwargs) 2022-11-23T03:12:18.2270658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2270760Z p_assert( 2022-11-23T03:12:18.2271094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2271206Z traceback.print_stack() 2022-11-23T03:12:18.2271337Z File "", line 1, in 2022-11-23T03:12:18.2271546Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2271686Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2271888Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2272036Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2272248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2272333Z self.run() 2022-11-23T03:12:18.2272534Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2272677Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2273014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2273150Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2273511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2273633Z getattr(self, test_name)() 2022-11-23T03:12:18.2273990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2274071Z fn() 2022-11-23T03:12:18.2274435Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2274556Z test(self, **param_kwargs) 2022-11-23T03:12:18.2274909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2275032Z return func(*args, **kwargs) 2022-11-23T03:12:18.2275281Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2275396Z self.run_subtests( 2022-11-23T03:12:18.2275811Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2275963Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2276329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2276484Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2276857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2276976Z output = model(*input) 2022-11-23T03:12:18.2277299Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2277437Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2277811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2278019Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2278384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2278504Z _lazy_init(state, module) 2022-11-23T03:12:18.2278853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2278996Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2279329Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2279451Z return func(*args, **kwargs) 2022-11-23T03:12:18.2279828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2279913Z p_assert( 2022-11-23T03:12:18.2280245Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2280379Z traceback.print_stack() 2022-11-23T03:12:18.2280614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2280848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2281080Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2281308Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2281437Z File "", line 1, in 2022-11-23T03:12:18.2281628Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2281768Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2281968Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2282116Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2282331Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2282435Z self.run() 2022-11-23T03:12:18.2282637Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2282762Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2283103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2283234Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2283594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2283717Z getattr(self, test_name)() 2022-11-23T03:12:18.2284073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2284168Z fn() 2022-11-23T03:12:18.2284528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2284685Z test(self, **param_kwargs) 2022-11-23T03:12:18.2285049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2285171Z return func(*args, **kwargs) 2022-11-23T03:12:18.2285421Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2285535Z self.run_subtests( 2022-11-23T03:12:18.2285884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2286044Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2286405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2286546Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2286921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2287090Z output = model(*input) 2022-11-23T03:12:18.2287418Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2287559Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2287933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2288107Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2288469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2288572Z _lazy_init(state, module) 2022-11-23T03:12:18.2288923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2289066Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2289411Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2289535Z return func(*args, **kwargs) 2022-11-23T03:12:18.2289911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2290012Z p_assert( 2022-11-23T03:12:18.2290346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2290455Z traceback.print_stack() 2022-11-23T03:12:18.2290584Z File "", line 1, in 2022-11-23T03:12:18.2290797Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2290939Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2291139Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2291288Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2291505Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2291608Z self.run() 2022-11-23T03:12:18.2291795Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2291940Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2292277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2292407Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2292764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2292885Z getattr(self, test_name)() 2022-11-23T03:12:18.2293240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2293320Z fn() 2022-11-23T03:12:18.2293684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2293853Z test(self, **param_kwargs) 2022-11-23T03:12:18.2294218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2294339Z return func(*args, **kwargs) 2022-11-23T03:12:18.2294589Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2294701Z self.run_subtests( 2022-11-23T03:12:18.2295051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2295192Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2295556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2295708Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2296135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2296254Z output = model(*input) 2022-11-23T03:12:18.2296575Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2296714Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2297085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2297242Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2297606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2297727Z _lazy_init(state, module) 2022-11-23T03:12:18.2298076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2298216Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2298559Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2298683Z return func(*args, **kwargs) 2022-11-23T03:12:18.2299057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2299157Z p_assert( 2022-11-23T03:12:18.2299473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2299596Z traceback.print_stack() 2022-11-23T03:12:18.2299723Z File "", line 1, in 2022-11-23T03:12:18.2299932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2300072Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2300272Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2300421Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2300620Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2300724Z self.run() 2022-11-23T03:12:18.2300924Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2301066Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2301401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2301531Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2301889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2302010Z getattr(self, test_name)() 2022-11-23T03:12:18.2302350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2302446Z fn() 2022-11-23T03:12:18.2302806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2302977Z test(self, **param_kwargs) 2022-11-23T03:12:18.2303341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2303462Z return func(*args, **kwargs) 2022-11-23T03:12:18.2303713Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2303823Z self.run_subtests( 2022-11-23T03:12:18.2304746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2305036Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2305748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2306025Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2306544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2306663Z output = model(*input) 2022-11-23T03:12:18.2306985Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2307126Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2307555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2307730Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2308096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2308217Z _lazy_init(state, module) 2022-11-23T03:12:18.2308568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2308713Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2309054Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2309177Z return func(*args, **kwargs) 2022-11-23T03:12:18.2309532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2309634Z p_assert( 2022-11-23T03:12:18.2309969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2310094Z traceback.print_stack() 2022-11-23T03:12:18.2310221Z File "", line 1, in 2022-11-23T03:12:18.2310430Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2310570Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2310753Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2310910Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2311128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2311230Z self.run() 2022-11-23T03:12:18.2311493Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2311639Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2311976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2312108Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2312447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2312569Z getattr(self, test_name)() 2022-11-23T03:12:18.2312924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2313021Z fn() 2022-11-23T03:12:18.2313461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2313597Z test(self, **param_kwargs) 2022-11-23T03:12:18.2313954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2314076Z return func(*args, **kwargs) 2022-11-23T03:12:18.2314312Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2314426Z self.run_subtests( 2022-11-23T03:12:18.2314777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2314938Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2315299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2315449Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2315876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2315995Z output = model(*input) 2022-11-23T03:12:18.2316303Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2316444Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2316817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2316992Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2317356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2317476Z _lazy_init(state, module) 2022-11-23T03:12:18.2317821Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2317968Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2318288Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2318414Z return func(*args, **kwargs) 2022-11-23T03:12:18.2318795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2318896Z p_assert( 2022-11-23T03:12:18.2319228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2319354Z traceback.print_stack() 2022-11-23T03:12:18.2319590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2319822Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2320152Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2320388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2320522Z File "", line 1, in 2022-11-23T03:12:18.2320730Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2320871Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2321072Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2321220Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2321429Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2321515Z self.run() 2022-11-23T03:12:18.2321714Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2321861Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2322200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2322335Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2322736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2322863Z getattr(self, test_name)() 2022-11-23T03:12:18.2323204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2323302Z fn() 2022-11-23T03:12:18.2323661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2323783Z test(self, **param_kwargs) 2022-11-23T03:12:18.2324139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2324262Z return func(*args, **kwargs) 2022-11-23T03:12:18.2324514Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2324676Z self.run_subtests( 2022-11-23T03:12:18.2325015Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2325178Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2325543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2325696Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2326115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2326233Z output = model(*input) 2022-11-23T03:12:18.2326569Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2326708Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2327062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2327245Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2327614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2327734Z _lazy_init(state, module) 2022-11-23T03:12:18.2328083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2328224Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2328555Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2328678Z return func(*args, **kwargs) 2022-11-23T03:12:18.2329036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2329137Z p_assert( 2022-11-23T03:12:18.2329469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2329600Z traceback.print_stack() 2022-11-23T03:12:18.2329730Z File "", line 1, in 2022-11-23T03:12:18.2329938Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2330076Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2330278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2330410Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2330621Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2330723Z self.run() 2022-11-23T03:12:18.2330923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2331068Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2331405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2331538Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2331946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2332059Z getattr(self, test_name)() 2022-11-23T03:12:18.2332419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2332514Z fn() 2022-11-23T03:12:18.2332878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2333001Z test(self, **param_kwargs) 2022-11-23T03:12:18.2333351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2333473Z return func(*args, **kwargs) 2022-11-23T03:12:18.2333705Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2333869Z self.run_subtests( 2022-11-23T03:12:18.2334222Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2334384Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2334746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2334898Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2335272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2335390Z output = model(*input) 2022-11-23T03:12:18.2335713Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2335835Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2336209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2336392Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2336756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2336876Z _lazy_init(state, module) 2022-11-23T03:12:18.2337221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2337362Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2337693Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2337799Z return func(*args, **kwargs) 2022-11-23T03:12:18.2338173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2338273Z p_assert( 2022-11-23T03:12:18.2338603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2338733Z traceback.print_stack() 2022-11-23T03:12:18.2338862Z File "", line 1, in 2022-11-23T03:12:18.2339068Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2339192Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2339393Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2339540Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2339761Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2339864Z self.run() 2022-11-23T03:12:18.2340063Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2340207Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2340545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2340663Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2341069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2341199Z getattr(self, test_name)() 2022-11-23T03:12:18.2341557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2341653Z fn() 2022-11-23T03:12:18.2342015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2342137Z test(self, **param_kwargs) 2022-11-23T03:12:18.2342490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2342596Z return func(*args, **kwargs) 2022-11-23T03:12:18.2342844Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2343002Z self.run_subtests( 2022-11-23T03:12:18.2343354Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2343517Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2344352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2344645Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2345367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2345559Z output = model(*input) 2022-11-23T03:12:18.2346182Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2346379Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2346764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2346953Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2347319Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2347437Z _lazy_init(state, module) 2022-11-23T03:12:18.2347785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2347909Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2348241Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2348362Z return func(*args, **kwargs) 2022-11-23T03:12:18.2348740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2348842Z p_assert( 2022-11-23T03:12:18.2349179Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2349305Z traceback.print_stack() 2022-11-23T03:12:18.2349436Z File "", line 1, in 2022-11-23T03:12:18.2349626Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2349765Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2349965Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2350114Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2350326Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2350428Z self.run() 2022-11-23T03:12:18.2350628Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2350754Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2351092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2351305Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2351680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2351803Z getattr(self, test_name)() 2022-11-23T03:12:18.2352157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2352253Z fn() 2022-11-23T03:12:18.2352613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2352717Z test(self, **param_kwargs) 2022-11-23T03:12:18.2353068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2353191Z return func(*args, **kwargs) 2022-11-23T03:12:18.2353441Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2353620Z self.run_subtests( 2022-11-23T03:12:18.2353975Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2354137Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2354500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2354633Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2355005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2355124Z output = model(*input) 2022-11-23T03:12:18.2355448Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2355587Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2355957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2356137Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2356502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2356605Z _lazy_init(state, module) 2022-11-23T03:12:18.2356954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2357096Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2357433Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2357555Z return func(*args, **kwargs) 2022-11-23T03:12:18.2357930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2358030Z p_assert( 2022-11-23T03:12:18.2358369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2358477Z traceback.print_stack() 2022-11-23T03:12:18.2358716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2358985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2359250Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2359483Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2359613Z File "", line 1, in 2022-11-23T03:12:18.2359821Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2359962Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2360145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2360301Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2360562Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2360673Z self.run() 2022-11-23T03:12:18.2360875Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2361022Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2361369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2361483Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2361842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2361964Z getattr(self, test_name)() 2022-11-23T03:12:18.2362317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2362413Z fn() 2022-11-23T03:12:18.2362841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2362963Z test(self, **param_kwargs) 2022-11-23T03:12:18.2363317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2363422Z return func(*args, **kwargs) 2022-11-23T03:12:18.2363673Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2363785Z self.run_subtests( 2022-11-23T03:12:18.2364134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2364293Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2364657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2364806Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2365188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2365290Z output = model(*input) 2022-11-23T03:12:18.2365615Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2365756Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2366127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2366301Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2366665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2366784Z _lazy_init(state, module) 2022-11-23T03:12:18.2367132Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2367279Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2367600Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2367723Z return func(*args, **kwargs) 2022-11-23T03:12:18.2368100Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2368200Z p_assert( 2022-11-23T03:12:18.2368536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2368661Z traceback.print_stack() 2022-11-23T03:12:18.2368788Z File "", line 1, in 2022-11-23T03:12:18.2368982Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2369158Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2369360Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2369516Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2369775Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2369885Z self.run() 2022-11-23T03:12:18.2370090Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2370235Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2370556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2370689Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2371047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2371168Z getattr(self, test_name)() 2022-11-23T03:12:18.2371522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2371618Z fn() 2022-11-23T03:12:18.2372034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2372156Z test(self, **param_kwargs) 2022-11-23T03:12:18.2372495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2372617Z return func(*args, **kwargs) 2022-11-23T03:12:18.2372867Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2372978Z self.run_subtests( 2022-11-23T03:12:18.2373324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2373485Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2373847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2374004Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2374363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2374481Z output = model(*input) 2022-11-23T03:12:18.2374804Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2374942Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2375316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2375491Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2375851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2375969Z _lazy_init(state, module) 2022-11-23T03:12:18.2376299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2376444Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2376784Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2376907Z return func(*args, **kwargs) 2022-11-23T03:12:18.2377279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2377380Z p_assert( 2022-11-23T03:12:18.2377714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2377837Z traceback.print_stack() 2022-11-23T03:12:18.2377948Z File "", line 1, in 2022-11-23T03:12:18.2378155Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2378296Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2378498Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2378652Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2378918Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2379029Z self.run() 2022-11-23T03:12:18.2379213Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2379358Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2379695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2379826Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2380185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2380306Z getattr(self, test_name)() 2022-11-23T03:12:18.2380659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2380805Z fn() 2022-11-23T03:12:18.2381154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2381277Z test(self, **param_kwargs) 2022-11-23T03:12:18.2381632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2381754Z return func(*args, **kwargs) 2022-11-23T03:12:18.2382006Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2382116Z self.run_subtests( 2022-11-23T03:12:18.2382466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2382629Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2382973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2383129Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2383507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2383628Z output = model(*input) 2022-11-23T03:12:18.2384293Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2384560Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2385279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2385601Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2386223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2386352Z _lazy_init(state, module) 2022-11-23T03:12:18.2386703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2386854Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2387193Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2387315Z return func(*args, **kwargs) 2022-11-23T03:12:18.2387690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2387792Z p_assert( 2022-11-23T03:12:18.2388108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2388233Z traceback.print_stack() 2022-11-23T03:12:18.2388361Z File "", line 1, in 2022-11-23T03:12:18.2388570Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2388712Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2388913Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2389145Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2389352Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2389458Z self.run() 2022-11-23T03:12:18.2389658Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2389803Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2390141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2390270Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2390626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2390746Z getattr(self, test_name)() 2022-11-23T03:12:18.2391084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2391242Z fn() 2022-11-23T03:12:18.2391608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2391731Z test(self, **param_kwargs) 2022-11-23T03:12:18.2392086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2392208Z return func(*args, **kwargs) 2022-11-23T03:12:18.2392456Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2392568Z self.run_subtests( 2022-11-23T03:12:18.2392897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2393059Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2393421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2393581Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2393955Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2394074Z output = model(*input) 2022-11-23T03:12:18.2394399Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2394539Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2394895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2395073Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2395435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2395554Z _lazy_init(state, module) 2022-11-23T03:12:18.2395906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2396054Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2396388Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2396510Z return func(*args, **kwargs) 2022-11-23T03:12:18.2396866Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2396969Z p_assert( 2022-11-23T03:12:18.2397303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2397426Z traceback.print_stack() 2022-11-23T03:12:18.2397661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2397894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2398124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2398444Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2398563Z File "", line 1, in 2022-11-23T03:12:18.2398772Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2398912Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2399114Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2399264Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2399473Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2399573Z self.run() 2022-11-23T03:12:18.2399776Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2399906Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2400246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2400434Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2400793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2400917Z getattr(self, test_name)() 2022-11-23T03:12:18.2401274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2401372Z fn() 2022-11-23T03:12:18.2401713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2401835Z test(self, **param_kwargs) 2022-11-23T03:12:18.2402186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2402311Z return func(*args, **kwargs) 2022-11-23T03:12:18.2402562Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2402680Z self.run_subtests( 2022-11-23T03:12:18.2403026Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2403187Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2403547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2403682Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2404053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2404170Z output = model(*input) 2022-11-23T03:12:18.2404492Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2404632Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2405006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2405184Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2405544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2405647Z _lazy_init(state, module) 2022-11-23T03:12:18.2405993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2406134Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2406469Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2406591Z return func(*args, **kwargs) 2022-11-23T03:12:18.2406962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2407062Z p_assert( 2022-11-23T03:12:18.2407442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2407556Z traceback.print_stack() 2022-11-23T03:12:18.2407685Z File "", line 1, in 2022-11-23T03:12:18.2407895Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2408035Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2408234Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2408384Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2408597Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2408682Z self.run() 2022-11-23T03:12:18.2408882Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2409028Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2409367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2409553Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2409914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2410034Z getattr(self, test_name)() 2022-11-23T03:12:18.2410389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2410469Z fn() 2022-11-23T03:12:18.2410829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2410953Z test(self, **param_kwargs) 2022-11-23T03:12:18.2411307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2411487Z return func(*args, **kwargs) 2022-11-23T03:12:18.2411742Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2411861Z self.run_subtests( 2022-11-23T03:12:18.2412211Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2412356Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2412717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2412868Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2413241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2413359Z output = model(*input) 2022-11-23T03:12:18.2413485Z File "", line 1, in 2022-11-23T03:12:18.2413809Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2413951Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2414144Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2414285Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2414655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2414828Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2415035Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2415185Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2415549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2415671Z _lazy_init(state, module) 2022-11-23T03:12:18.2415864Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2415971Z self.run() 2022-11-23T03:12:18.2416364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2416512Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.2416713Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2416858Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2417197Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2417304Z return func(*args, **kwargs) 2022-11-23T03:12:18.2417638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2417768Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2418145Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2418245Z p_assert( 2022-11-23T03:12:18.2418666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2418789Z getattr(self, test_name)() 2022-11-23T03:12:18.2419123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2419230Z traceback.print_stack() 2022-11-23T03:12:18.2419584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2419682Z fn() 2022-11-23T03:12:18.2420042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2420164Z test(self, **param_kwargs) 2022-11-23T03:12:18.2420515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2420640Z return func(*args, **kwargs) 2022-11-23T03:12:18.2420887Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2420988Z self.run_subtests( 2022-11-23T03:12:18.2421340Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2421505Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2421864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2422015Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2422388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2422507Z output = model(*input) 2022-11-23T03:12:18.2422833Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2422956Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2423338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2423513Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2424083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2424306Z _lazy_init(state, module) 2022-11-23T03:12:18.2426010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2426554Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2427572Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2427820Z return func(*args, **kwargs) 2022-11-23T03:12:18.2428687Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2428935Z p_assert( 2022-11-23T03:12:18.2430147Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2430455Z traceback.print_stack() 2022-11-23T03:12:18.2430777Z File "", line 1, in 2022-11-23T03:12:18.2431000Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2431152Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2431341Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2431502Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2431719Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2431827Z self.run() 2022-11-23T03:12:18.2432036Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2432187Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2432783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2432897Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2433258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2433383Z getattr(self, test_name)() 2022-11-23T03:12:18.2433743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2433844Z fn() 2022-11-23T03:12:18.2434202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2434496Z test(self, **param_kwargs) 2022-11-23T03:12:18.2434861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2434968Z return func(*args, **kwargs) 2022-11-23T03:12:18.2435236Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2435355Z self.run_subtests( 2022-11-23T03:12:18.2435796Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2435887Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2436258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2436416Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2436799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2436899Z output = model(*input) 2022-11-23T03:12:18.2437231Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2437376Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2437922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2438100Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2438459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2438581Z _lazy_init(state, module) 2022-11-23T03:12:18.2438925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2439047Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2439377Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2439501Z return func(*args, **kwargs) 2022-11-23T03:12:18.2440102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2440164Z p_assert( 2022-11-23T03:12:18.2440560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2440701Z traceback.print_stack() 2022-11-23T03:12:18.2441014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2441322Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2441478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2441718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2441859Z File "", line 1, in 2022-11-23T03:12:18.2442079Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2442225Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2442512Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2442648Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2442844Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2442953Z self.run() 2022-11-23T03:12:18.2443158Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2443310Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2443659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2443857Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2444323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2444452Z getattr(self, test_name)() 2022-11-23T03:12:18.2444785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2444890Z fn() 2022-11-23T03:12:18.2445256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2445383Z test(self, **param_kwargs) 2022-11-23T03:12:18.2445734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2445862Z return func(*args, **kwargs) 2022-11-23T03:12:18.2446115Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2446229Z self.run_subtests( 2022-11-23T03:12:18.2446550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2446713Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2447069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2447226Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2447596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2447721Z output = model(*input) 2022-11-23T03:12:18.2448038Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2448180Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2448525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2448702Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2449063Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2449189Z _lazy_init(state, module) 2022-11-23T03:12:18.2449533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2449728Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2450073Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2450199Z return func(*args, **kwargs) 2022-11-23T03:12:18.2450550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2450655Z p_assert( 2022-11-23T03:12:18.2450990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2451117Z traceback.print_stack() 2022-11-23T03:12:18.2451247Z File "", line 1, in 2022-11-23T03:12:18.2451458Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2451601Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2451830Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2451988Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2452204Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2452311Z self.run() 2022-11-23T03:12:18.2452513Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2452659Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2452995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2453131Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2453460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2453585Z getattr(self, test_name)() 2022-11-23T03:12:18.2453937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2454042Z fn() 2022-11-23T03:12:18.2454403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2454530Z test(self, **param_kwargs) 2022-11-23T03:12:18.2454879Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2455006Z return func(*args, **kwargs) 2022-11-23T03:12:18.2455234Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2455349Z self.run_subtests( 2022-11-23T03:12:18.2455691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2455852Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2456210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2456369Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2456737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2456859Z output = model(*input) 2022-11-23T03:12:18.2457156Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2457298Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2457665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2457841Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2458198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2458322Z _lazy_init(state, module) 2022-11-23T03:12:18.2458664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2458857Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2459176Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2459305Z return func(*args, **kwargs) 2022-11-23T03:12:18.2459854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2459964Z p_assert( 2022-11-23T03:12:18.2460101Z File "", line 1, in 2022-11-23T03:12:18.2460444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2460577Z traceback.print_stack() 2022-11-23T03:12:18.2460793Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2460921Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2461262Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2461339Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2461560Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2461668Z self.run() 2022-11-23T03:12:18.2461881Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2462034Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2462357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2462498Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2463010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2463139Z getattr(self, test_name)() 2022-11-23T03:12:18.2463552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2463698Z fn() 2022-11-23T03:12:18.2464377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2464522Z test(self, **param_kwargs) 2022-11-23T03:12:18.2464863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2465163Z return func(*args, **kwargs) 2022-11-23T03:12:18.2465420Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2465540Z self.run_subtests( 2022-11-23T03:12:18.2465893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2466061Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2466431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2466598Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2466957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2467082Z output = model(*input) 2022-11-23T03:12:18.2467414Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2467561Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2467942Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2468215Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2468625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2468750Z _lazy_init(state, module) 2022-11-23T03:12:18.2469171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2469336Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2469682Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2469811Z return func(*args, **kwargs) 2022-11-23T03:12:18.2470192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2470299Z p_assert( 2022-11-23T03:12:18.2470640Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2470769Z traceback.print_stack() 2022-11-23T03:12:18.2470970Z File "", line 1, in 2022-11-23T03:12:18.2471096Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2471241Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2471521Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2471678Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2471895Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2472002Z self.run() 2022-11-23T03:12:18.2472210Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2472338Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2472685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2472913Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2473195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2473418Z getattr(self, test_name)() 2022-11-23T03:12:18.2473687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2473798Z fn() 2022-11-23T03:12:18.2474149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2474280Z test(self, **param_kwargs) 2022-11-23T03:12:18.2474709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2474848Z return func(*args, **kwargs) 2022-11-23T03:12:18.2475080Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2475145Z self.run_subtests( 2022-11-23T03:12:18.2475500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2475664Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2476009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2476176Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2476560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2476686Z output = model(*input) 2022-11-23T03:12:18.2477014Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2477160Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2477541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2477720Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2478093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2478196Z _lazy_init(state, module) 2022-11-23T03:12:18.2478602Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2478755Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2479098Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2479226Z return func(*args, **kwargs) 2022-11-23T03:12:18.2479607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2479752Z p_assert( 2022-11-23T03:12:18.2480031Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2480163Z traceback.print_stack() 2022-11-23T03:12:18.2480403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2480645Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2480936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2481178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2481355Z File "", line 1, in 2022-11-23T03:12:18.2481529Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2481652Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2481859Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2482013Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2482384Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2482533Z self.run() 2022-11-23T03:12:18.2482736Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2482881Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2483224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2483336Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2483691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2483813Z getattr(self, test_name)() 2022-11-23T03:12:18.2484163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2484261Z fn() 2022-11-23T03:12:18.2484615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2484737Z test(self, **param_kwargs) 2022-11-23T03:12:18.2485086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2485188Z return func(*args, **kwargs) 2022-11-23T03:12:18.2485443Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2485557Z self.run_subtests( 2022-11-23T03:12:18.2485901Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2486061Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2486414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2486567Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2487111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2487213Z output = model(*input) 2022-11-23T03:12:18.2487544Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2487688Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2488145Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2488338Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2488709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2488838Z _lazy_init(state, module) 2022-11-23T03:12:18.2489191Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2489314Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2489653Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2489783Z return func(*args, **kwargs) 2022-11-23T03:12:18.2489916Z File "", line 1, in 2022-11-23T03:12:18.2490296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2490457Z p_assert( 2022-11-23T03:12:18.2490802Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2490931Z traceback.print_stack() 2022-11-23T03:12:18.2491124Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2491270Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2491479Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2491634Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2491851Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2491958Z self.run() 2022-11-23T03:12:18.2492164Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2492291Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2492643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2492784Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2493155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2493284Z getattr(self, test_name)() 2022-11-23T03:12:18.2493644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2493745Z fn() 2022-11-23T03:12:18.2494116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2494222Z test(self, **param_kwargs) 2022-11-23T03:12:18.2494581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2494708Z return func(*args, **kwargs) 2022-11-23T03:12:18.2494974Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2495093Z 
self.run_subtests( 2022-11-23T03:12:18.2495449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2495612Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2495978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2496114Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2496572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2496618Z output = model(*input) 2022-11-23T03:12:18.2496951Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2497098Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2497521Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2497709Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2498150Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2498186Z _lazy_init(state, module) 2022-11-23T03:12:18.2498542Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2498688Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2499186Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2499478Z return func(*args, **kwargs) 2022-11-23T03:12:18.2499858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2500017Z p_assert( 2022-11-23T03:12:18.2500364Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2500472Z traceback.print_stack() 2022-11-23T03:12:18.2500604Z File "", line 1, in 2022-11-23T03:12:18.2500819Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2500968Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2501173Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2501328Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2501543Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2501652Z self.run() 2022-11-23T03:12:18.2501839Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2501994Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2502490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2502623Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2502975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2503099Z getattr(self, test_name)() 2022-11-23T03:12:18.2503449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2503526Z fn() 2022-11-23T03:12:18.2504083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2504222Z test(self, **param_kwargs) 2022-11-23T03:12:18.2504574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2504703Z return func(*args, **kwargs) 2022-11-23T03:12:18.2504955Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in 
test_delayed_reduce_scatter 2022-11-23T03:12:18.2505070Z self.run_subtests( 2022-11-23T03:12:18.2505413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2505609Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2505910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2506065Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2506437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2506555Z output = model(*input) 2022-11-23T03:12:18.2537476Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2537702Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2538423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2538598Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2538974Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2539099Z _lazy_init(state, module) 2022-11-23T03:12:18.2539618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2539771Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2540116Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2540247Z return func(*args, **kwargs) 2022-11-23T03:12:18.2540672Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2540799Z p_assert( 2022-11-23T03:12:18.2541238Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2541371Z traceback.print_stack() 2022-11-23T03:12:18.2541507Z File "", line 1, in 2022-11-23T03:12:18.2541722Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2541868Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2542076Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2542236Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2542432Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2542545Z self.run() 2022-11-23T03:12:18.2542754Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2542909Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2543257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2543394Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2543759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2544341Z getattr(self, test_name)() 2022-11-23T03:12:18.2544656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2544760Z fn() 2022-11-23T03:12:18.2545131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2545261Z test(self, **param_kwargs) 2022-11-23T03:12:18.2545625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2545759Z return func(*args, **kwargs) 2022-11-23T03:12:18.2546124Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in 
test_delayed_reduce_scatter 2022-11-23T03:12:18.2546142Z self.run_subtests( 2022-11-23T03:12:18.2546477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2546642Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2547011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2547170Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2547707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2547827Z output = model(*input) 2022-11-23T03:12:18.2548147Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2548292Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2548719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2548910Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2549271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2549394Z _lazy_init(state, module) 2022-11-23T03:12:18.2549734Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2549878Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2550205Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2550330Z return func(*args, **kwargs) 2022-11-23T03:12:18.2550675Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2550849Z p_assert( 2022-11-23T03:12:18.2551185Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2551314Z traceback.print_stack() 2022-11-23T03:12:18.2551548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2551783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2552011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2552239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2552348Z File "", line 1, in 2022-11-23T03:12:18.2552554Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2552696Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2552904Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2553056Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2553266Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2553372Z self.run() 2022-11-23T03:12:18.2553548Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2553697Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2554033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2554169Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2554520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2554645Z getattr(self, test_name)() 2022-11-23T03:12:18.2554995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2555100Z fn() 2022-11-23T03:12:18.2555433Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2555558Z test(self, **param_kwargs) 2022-11-23T03:12:18.2555906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2556031Z return func(*args, **kwargs) 2022-11-23T03:12:18.2556278Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2556394Z self.run_subtests( 2022-11-23T03:12:18.2556734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2556895Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2557229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2557433Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2557812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2557932Z output = model(*input) 2022-11-23T03:12:18.2558250Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2558393Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2558757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2558932Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2559263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2559385Z _lazy_init(state, module) 2022-11-23T03:12:18.2559789Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2560172Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2560450Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2560598Z return func(*args, **kwargs) 2022-11-23T03:12:18.2560961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2561072Z p_assert( 2022-11-23T03:12:18.2561389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2561581Z traceback.print_stack() 2022-11-23T03:12:18.2561652Z File "", line 1, in 2022-11-23T03:12:18.2561868Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2562019Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2562228Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2562418Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2562604Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2562690Z self.run() 2022-11-23T03:12:18.2563053Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2563200Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2563702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2563839Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2564203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2564330Z getattr(self, test_name)() 2022-11-23T03:12:18.2564670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2564781Z fn() 2022-11-23T03:12:18.2565150Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2565277Z test(self, **param_kwargs) 2022-11-23T03:12:18.2565636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2565767Z return func(*args, **kwargs) 2022-11-23T03:12:18.2566022Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2566140Z self.run_subtests( 2022-11-23T03:12:18.2566472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2566639Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2567008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2567220Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2567611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2567735Z output = model(*input) 2022-11-23T03:12:18.2568066Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2568210Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2568722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2568822Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2569188Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2569364Z _lazy_init(state, module) 2022-11-23T03:12:18.2569726Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2569873Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2570214Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2570342Z return func(*args, **kwargs) 2022-11-23T03:12:18.2570699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2570806Z p_assert( 2022-11-23T03:12:18.2571146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2571433Z traceback.print_stack() 2022-11-23T03:12:18.2571565Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2571773Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2571916Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2572120Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2572247Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2572455Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2572723Z self.run() 2022-11-23T03:12:18.2572930Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2573081Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2573427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2573564Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2573928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2574034Z getattr(self, test_name)() 2022-11-23T03:12:18.2574404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2574508Z fn() 2022-11-23T03:12:18.2574875Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2575004Z test(self, **param_kwargs) 2022-11-23T03:12:18.2575363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2575491Z return func(*args, **kwargs) 2022-11-23T03:12:18.2575723Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2575840Z self.run_subtests( 2022-11-23T03:12:18.2576195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2576360Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2576777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2576940Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2577474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2577595Z output = model(*input) 2022-11-23T03:12:18.2578069Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2578214Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2578595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2578774Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2579139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2579316Z _lazy_init(state, module) 2022-11-23T03:12:18.2579676Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2579902Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2580249Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2580354Z return func(*args, **kwargs) 2022-11-23T03:12:18.2580737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2580844Z p_assert( 2022-11-23T03:12:18.2581186Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2581411Z traceback.print_stack() 2022-11-23T03:12:18.2581452Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2581666Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2581793Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2581998Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2582150Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2582369Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2582474Z self.run() 2022-11-23T03:12:18.2582830Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2582971Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2583304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2583415Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2583769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2584098Z getattr(self, test_name)() 2022-11-23T03:12:18.2584475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2584577Z fn() 2022-11-23T03:12:18.2585099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2585224Z test(self, **param_kwargs) 2022-11-23T03:12:18.2585581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2585687Z return func(*args, **kwargs) 2022-11-23T03:12:18.2585943Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2586063Z self.run_subtests( 2022-11-23T03:12:18.2586416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2586581Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2587036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2587203Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2587588Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2587690Z output = model(*input) 2022-11-23T03:12:18.2588017Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2588163Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2588541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2588723Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2589089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2589279Z _lazy_init(state, module) 2022-11-23T03:12:18.2589639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.2589766Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2590104Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2590231Z return func(*args, **kwargs) 2022-11-23T03:12:18.2590615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2590723Z p_assert( 2022-11-23T03:12:18.2591061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2591195Z traceback.print_stack() 2022-11-23T03:12:18.2591437Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2591658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2591900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2592135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2592274Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2592487Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2592633Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2592839Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2592996Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2593190Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2593301Z self.run() 2022-11-23T03:12:18.2593507Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2593665Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2594010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2594145Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2594505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2594610Z getattr(self, test_name)() 2022-11-23T03:12:18.2594966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2595063Z fn() 2022-11-23T03:12:18.2595427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2595553Z test(self, **param_kwargs) 2022-11-23T03:12:18.2595909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2596042Z return func(*args, **kwargs) 2022-11-23T03:12:18.2596350Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2596471Z self.run_subtests( 2022-11-23T03:12:18.2596906Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2596979Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2597341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2597497Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2597870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2597995Z output = model(*input) 2022-11-23T03:12:18.2598321Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2598496Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2598879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2599059Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2599575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2599698Z _lazy_init(state, module) 2022-11-23T03:12:18.2600041Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2600183Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2600508Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2600611Z return func(*args, **kwargs) 2022-11-23T03:12:18.2600979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2601094Z p_assert( 2022-11-23T03:12:18.2601426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2601552Z traceback.print_stack() 2022-11-23T03:12:18.2601675Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2601875Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2602016Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2602193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2602343Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2602554Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2602661Z self.run() 2022-11-23T03:12:18.2602861Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2603009Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2603345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2603458Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2603813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2603938Z getattr(self, test_name)() 2022-11-23T03:12:18.2604288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2604385Z fn() 2022-11-23T03:12:18.2604737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2604854Z test(self, **param_kwargs) 2022-11-23T03:12:18.2605194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2605300Z return func(*args, **kwargs) 2022-11-23T03:12:18.2605597Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2605716Z self.run_subtests( 2022-11-23T03:12:18.2606061Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2606222Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2606576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2606728Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2607091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2607188Z output = model(*input) 2022-11-23T03:12:18.2607533Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2607711Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2608250Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2608431Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2608795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2608920Z _lazy_init(state, module) 2022-11-23T03:12:18.2609271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2609419Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2609739Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2609866Z return func(*args, **kwargs) 2022-11-23T03:12:18.2610257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2610363Z p_assert( 2022-11-23T03:12:18.2610700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2610827Z traceback.print_stack() 2022-11-23T03:12:18.2610954Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2611145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2611290Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2611495Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2611650Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2611867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2611980Z self.run() 2022-11-23T03:12:18.2612283Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2612347Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2612743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2612812Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2613174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2613310Z getattr(self, test_name)() 2022-11-23T03:12:18.2613670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2613932Z fn() 2022-11-23T03:12:18.2614289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2614415Z test(self, **param_kwargs) 2022-11-23T03:12:18.2614740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2614869Z return func(*args, **kwargs) 2022-11-23T03:12:18.2615170Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2615463Z self.run_subtests( 2022-11-23T03:12:18.2615821Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2615983Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2616347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2616499Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2616948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2616976Z output = model(*input) 2022-11-23T03:12:18.2617303Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2617502Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2617882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2618221Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2618582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2618707Z _lazy_init(state, module) 2022-11-23T03:12:18.2619026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2619170Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2619500Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2619626Z return func(*args, **kwargs) 2022-11-23T03:12:18.2620002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2620108Z p_assert( 2022-11-23T03:12:18.2620434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2620563Z traceback.print_stack() 2022-11-23T03:12:18.2620671Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2621050Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2621198Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2621404Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2621560Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2621777Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2621890Z self.run() 2022-11-23T03:12:18.2622072Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2622229Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2622573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2622710Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2623074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2623200Z getattr(self, test_name)() 2022-11-23T03:12:18.2623566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2623670Z fn() 2022-11-23T03:12:18.2624211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.2624356Z test(self, **param_kwargs) 2022-11-23T03:12:18.2624717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2624922Z return func(*args, **kwargs) 2022-11-23T03:12:18.2625230Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2625311Z self.run_subtests( 2022-11-23T03:12:18.2625667Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2625828Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2626219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2626327Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2626704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2626831Z output = model(*input) 2022-11-23T03:12:18.2627160Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2627377Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2628085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2628265Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2628614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2628740Z _lazy_init(state, module) 2022-11-23T03:12:18.2629092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2629242Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2629580Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2629708Z return func(*args, **kwargs) 2022-11-23T03:12:18.2630153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2630205Z p_assert( 2022-11-23T03:12:18.2630523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.2630971Z traceback.print_stack() 2022-11-23T03:12:18.2631217Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2631456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2631691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2631930Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2632065Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2632280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2632407Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2632618Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2632773Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2632988Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2633101Z self.run() 2022-11-23T03:12:18.2633306Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2633453Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2633936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2634065Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2634414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2634542Z getattr(self, test_name)() 2022-11-23T03:12:18.2634945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2635218Z fn() 2022-11-23T03:12:18.2635590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2635719Z test(self, **param_kwargs) 2022-11-23T03:12:18.2636056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2636187Z return func(*args, **kwargs) 2022-11-23T03:12:18.2636443Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2636562Z self.run_subtests( 2022-11-23T03:12:18.2636916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2637083Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2637271Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.2637645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2637780Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2638156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2638274Z output = model(*input) 2022-11-23T03:12:18.2638640Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2638779Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2639094Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2639234Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2639429Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2639561Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2639927Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2640105Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2640313Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2640417Z self.run() 2022-11-23T03:12:18.2641016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2641143Z _lazy_init(state, module) 2022-11-23T03:12:18.2641352Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2641543Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2641837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2641989Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2642333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2642474Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2642815Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2642946Z return func(*args, **kwargs) 2022-11-23T03:12:18.2643307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2643412Z getattr(self, test_name)() 2022-11-23T03:12:18.2643793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2643895Z p_assert( 2022-11-23T03:12:18.2644245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2644339Z fn() 2022-11-23T03:12:18.2644877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2645011Z traceback.print_stack() 2022-11-23T03:12:18.2645345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2645640Z test(self, **param_kwargs) 2022-11-23T03:12:18.2646002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2646133Z return func(*args, **kwargs) 2022-11-23T03:12:18.2646390Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2646507Z self.run_subtests( 2022-11-23T03:12:18.2646860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2647027Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2647489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2647647Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2648025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2648150Z output = model(*input) 2022-11-23T03:12:18.2648635Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2648776Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2649141Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2649315Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2649670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2649775Z _lazy_init(state, module) 2022-11-23T03:12:18.2650124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2650270Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2650607Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2650732Z return func(*args, **kwargs) 2022-11-23T03:12:18.2651102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2651207Z p_assert( 2022-11-23T03:12:18.2651533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2651638Z traceback.print_stack() 2022-11-23T03:12:18.2651769Z File "", line 1, in 2022-11-23T03:12:18.2651978Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2652126Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2652325Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2652477Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2652685Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2652769Z self.run() 2022-11-23T03:12:18.2652972Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2653120Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2653631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2653768Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2654130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2654261Z getattr(self, test_name)() 2022-11-23T03:12:18.2654683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2654771Z fn() 2022-11-23T03:12:18.2655142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2655272Z test(self, **param_kwargs) 2022-11-23T03:12:18.2655634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2655762Z return func(*args, **kwargs) 2022-11-23T03:12:18.2656017Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2656134Z self.run_subtests( 2022-11-23T03:12:18.2656635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2656822Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2657183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2657339Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2657708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2657829Z output = model(*input) 2022-11-23T03:12:18.2658145Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2658285Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2658647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2658801Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2659159Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2659289Z _lazy_init(state, module) 2022-11-23T03:12:18.2659630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2659769Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2660092Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2660213Z return func(*args, **kwargs) 2022-11-23T03:12:18.2660758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2660914Z p_assert( 2022-11-23T03:12:18.2661181Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2661311Z traceback.print_stack() 2022-11-23T03:12:18.2661445Z File "", line 1, in 2022-11-23T03:12:18.2661660Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.2661889Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.2662014Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.2662147Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.2662362Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.2662472Z self.run() 2022-11-23T03:12:18.2662680Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.2662832Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.2663173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.2663466Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.2663820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.2664131Z getattr(self, test_name)() 2022-11-23T03:12:18.2664565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.2664698Z fn() 2022-11-23T03:12:18.2665031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.2665156Z test(self, **param_kwargs) 2022-11-23T03:12:18.2665500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.2665624Z return func(*args, **kwargs) 2022-11-23T03:12:18.2666054Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T03:12:18.2666150Z self.run_subtests( 2022-11-23T03:12:18.2666505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.2666735Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.2667107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.2667263Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.2667637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.2667761Z output = model(*input) 2022-11-23T03:12:18.2668090Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.2668212Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.2668651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.2668833Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.2669201Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.2669334Z _lazy_init(state, module) 2022-11-23T03:12:18.2669684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.2669828Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.2670165Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.2670272Z return func(*args, **kwargs) 2022-11-23T03:12:18.2670653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.2670761Z p_assert( 2022-11-23T03:12:18.2671099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.2671229Z traceback.print_stack() 2022-11-23T03:12:18.2671470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2671718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2672200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2672319Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2672542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2672768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2673157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2673389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2673620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2673849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2674139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2674356Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2674585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2674808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2675034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2675323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2675480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2675704Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2675973Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2676193Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2676400Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2676623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2676849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2677070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2677293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2677664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2677872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2678094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2679032Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2679778Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2680519Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2681242Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2681969Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2682743Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2683629Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2684336Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2685039Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2685785Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2686482Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2687193Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2687432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2687841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2688137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2688378Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2688610Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2688835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2689058Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2689285Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2689500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2689721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2689953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2690183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2690407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2690625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2690854Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2691230Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2691602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2691877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2692116Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2692344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2692577Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2692798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2693016Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2693244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2693450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2693725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2693948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2694166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2694395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2694614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2694833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2695058Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2695390Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2695495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2695721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2695942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2696171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2696398Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2696625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2696933Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2697071Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2697274Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2697503Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2697738Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2697961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2698177Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2698404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2698627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2699374Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2700467Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2701218Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2701950Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2702690Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2703605Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2704495Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2705212Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2706011Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2706628Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2707337Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2708034Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2708247Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2708634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2708871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2709099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2709393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2709634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2709865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2710096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2710320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2710526Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2710757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2710986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2711207Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2711489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2711716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2711934Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.2712163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2712468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2712654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2712819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.2713085Z dist init r=0, world=4 2022-11-23T03:12:18.2713414Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2713732Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2714034Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2714327Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2714614Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2714911Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2715190Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2715475Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2716027Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2716246Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2716544Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2716898Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.2717024Z dist init r=1, world=4 2022-11-23T03:12:18.2717355Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2717678Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2717985Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2718285Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2718575Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2718934Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2719241Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2719544Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2719839Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2720193Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2720501Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2720957Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.2721070Z dist init r=2, world=4 2022-11-23T03:12:18.2721558Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2721868Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2722168Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2722460Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2722768Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2723064Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2723360Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2723665Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2724017Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2724320Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2724762Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2725055Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.2725166Z dist init r=3, world=4 2022-11-23T03:12:18.2725477Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2725828Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2726121Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2726421Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2726719Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2727016Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2727307Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2727595Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2727889Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2728348Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2728642Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.2728946Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.2729051Z ok (6.523s) 2022-11-23T03:12:18.2729248Z test_delayed_reduce_scatter_offload_true_none (__main__.TestParityWithDDP) 2022-11-23T03:12:18.2730155Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82399 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T03:12:18.2730387Z test_delayed_reduce_scatter_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T03:12:18.2731311Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82403 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T03:12:18.2731673Z test_mixture_of_experts_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12900 2022-11-23T03:12:18.2731891Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12901 2022-11-23T03:12:18.2732101Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12902 2022-11-23T03:12:18.2732322Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12903 2022-11-23T03:12:18.2732849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2733021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2733379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2733596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2733960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2734135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2734499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2734675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2735029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2735206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2735746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2735921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2736285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.2736463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2736840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2737024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2737273Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.2737519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.2737763Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.2738009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.2738400Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2738956Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2739332Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2739703Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.2739927Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.2740136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.2740359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.2740624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.2741908Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2741988Z warnings.warn( 2022-11-23T03:12:18.2743016Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2743176Z warnings.warn( 2022-11-23T03:12:18.2744356Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2744476Z warnings.warn( 2022-11-23T03:12:18.2744714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.2745852Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2745971Z warnings.warn( 2022-11-23T03:12:18.2746203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.2746427Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.2746665Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.2747049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2747427Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2747800Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2748157Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.2748391Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.2748610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.2748843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.2749221Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2749439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.2749887Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2750273Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2750632Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2750841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.2751070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.2751301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.2751534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.2751907Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2752346Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.2752714Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2753074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2753799Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2754520Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2755240Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2755949Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2756187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.2756403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.2756644Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.2756880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.2757265Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2757648Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2758023Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2758386Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2758620Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.2758900Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.2759135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.2759496Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2759714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.2760093Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.2760651Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2761032Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2761318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.2761554Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.2761797Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.2762214Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2762399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.2762777Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2763166Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2763715Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2764442Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2765159Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2765877Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2766787Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2767510Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2768289Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2769094Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2769825Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2770565Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2771340Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2772219Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2772930Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2773173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.2773572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.2773805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.2774207Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2774423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.2774820Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2775214Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2775637Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2775950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.2776087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.2776324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.2776708Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2777101Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 
2022-11-23T03:12:18.2777336Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.2777927Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2778307Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2778719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.2778959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.2779192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.2779587Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2779916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.2780273Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2780672Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2781047Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2781790Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2782523Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2783420Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2784494Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2784741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.2784981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.2785222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.2785628Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2786021Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 
2022-11-23T03:12:18.2786257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.2786652Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2787044Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2787414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.2787703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.2788119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.2788516Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2788899Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2789140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.2789530Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2789908Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.2790213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.2790425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.2790656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.2791039Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2791284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.2791676Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2792055Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2792448Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2792690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.2792925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.2793157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.2793527Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2793757Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.2794149Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 
2022-11-23T03:12:18.2794540Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2794930Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2795673Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2796403Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2797187Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2797933Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2798654Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2799396Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2800176Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2800904Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2801627Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2802362Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2803223Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2803927Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2804170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.2804391Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.2804619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.2804985Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2805219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.2805594Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2806023Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2806415Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 
2022-11-23T03:12:18.2806636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.2806859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.2807074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.2807458Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2807670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.2808192Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2808467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2809018Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2809264Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.2809498Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.2809727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.2810108Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.2810347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.2810745Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 
2022-11-23T03:12:18.2811112Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.2811499Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.2811726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.2811962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.2812194Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.2812581Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.2812839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.2813372Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.2813745Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.2814100Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.2814318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.2814544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.2814771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.2815186Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.2815427Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.2815805Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.2816347Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.2816736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.2816970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.2817182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.2817453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.2817848Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.2818085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.2818461Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.2819006Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.2819384Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 
2022-11-23T03:12:18.2819612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.2819833Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.2820035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.2820414Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.2820645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.2821011Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.2821387Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.2821938Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.2822176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.2822412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.2822633Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.2823001Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.2823241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.2823634Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.2824338Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.2824987Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.2825200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.2825440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.2825763Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.2826051Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.2826270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.2826625Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.2826998Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.2827611Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.2827837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.2828071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.2828367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.2828697Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 
2022-11-23T03:12:18.2828925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.2829315Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.2829688Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.2830077Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.2830312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.2830532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.2830764Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.2831151Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.2831379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.2831816Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.2832137Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.2832503Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 
2022-11-23T03:12:18.2832609Z dist init r=3, world=4 2022-11-23T03:12:18.2832724Z dist init r=1, world=4 2022-11-23T03:12:18.2832841Z dist init r=2, world=4 2022-11-23T03:12:18.2833114Z dist init r=0, world=4 2022-11-23T03:12:18.2833219Z ok (7.626s) 2022-11-23T03:12:18.2833556Z test_mixture_of_experts_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13681 2022-11-23T03:12:18.2833766Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13682 2022-11-23T03:12:18.2833958Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13683 2022-11-23T03:12:18.2834217Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13684 2022-11-23T03:12:18.2834598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2834773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2835132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2835326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2835683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2836027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2836386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2836630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2836989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2837169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:12:18.2837550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2837739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2838095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2838273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2838647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2838824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2839063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.2839458Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.2839864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.2840109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.2840509Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2840898Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2841338Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2841735Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.2841945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.2842253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.2842394Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.2842626Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.2843692Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2843811Z warnings.warn( 2022-11-23T03:12:18.2844058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.2845061Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2845173Z warnings.warn( 2022-11-23T03:12:18.2846176Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2846432Z warnings.warn( 2022-11-23T03:12:18.2846737Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.2847690Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2847800Z warnings.warn( 2022-11-23T03:12:18.2848019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.2848256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.2848647Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2849032Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2849395Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2849771Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.2850003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.2850230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.2850466Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.2850676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.2851057Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2851429Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2851788Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2852162Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2852394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.2852677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.2852911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.2853284Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2853492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.2853868Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.2854239Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2854595Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2855371Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2856094Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2856810Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2857520Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2857761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.2857993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.2858216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.2858602Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2858839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.2859265Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2859583Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2859948Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2860187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.2860421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.2860654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.2861263Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2861565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.2861877Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.2862268Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2862633Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2862872Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.2863111Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.2863348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.2863725Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2864202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.2864605Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2864977Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2865365Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2866115Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2866850Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2867592Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2868329Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2868619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.2868863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.2869107Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.2869506Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2869722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.2870106Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2870499Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.2870957Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2871196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.2871435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.2871674Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.2872053Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2872306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.2872837Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2873178Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2873601Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2874002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.2874274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.2874471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.2874869Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.2875098Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.2875493Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2875953Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2876372Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2877009Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2877755Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2878647Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2879538Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2879784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.2880024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.2880243Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.2880689Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2880938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.2881319Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2881715Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2882105Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2882322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.2882556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.2882822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.2883219Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2883615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.2883992Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.2884353Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2884727Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2884956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.2885168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.2885550Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.2885942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2886187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.2886577Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2886953Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2887341Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2888093Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2889166Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2889888Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2890137Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.2890421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.2890641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.2891040Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2891268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.2891660Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2892056Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2892432Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2893228Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2893475Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.2893713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.2893930Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.2894306Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2894547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.2894950Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2895678Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2896698Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:18.2896915Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T03:12:18.2897295Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 
2022-11-23T03:12:18.2897785Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2898435Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2898680Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.2898904Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.2899184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.2899590Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2899814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.2900347Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2900728Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2901108Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.2901323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.2901607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.2901830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.2902054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.2902436Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.2902792Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.2903158Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.2903537Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.2903770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.2904159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.2904389Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.2904770Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.2905004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.2905363Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.2905739Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.2906098Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.2906824Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2907542Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2908261Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2908595Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.2908829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.2909054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.2909619Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.2909862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.2910236Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.2910626Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.2911098Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.2911317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.2911552Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.2911783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.2912160Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.2912404Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.2912796Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.2913237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.2913576Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 
2022-11-23T03:12:18.2913817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.2914184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.2914408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.2914785Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.2915021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.2915397Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.2915767Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.2916147Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.2916556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.2916775Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.2916985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.2917376Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.2917601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.2918109Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.2918443Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.2918832Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.2919709Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2919945Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.2920221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.2920449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.2920659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.2921020Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.2921398Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.2921775Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.2922308Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 
2022-11-23T03:12:18.2922544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.2922783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.2923013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.2923241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.2923631Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.2923999Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.2924376Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.2924767Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.2925517Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2926262Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2926493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.2926883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.2927157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.2927535Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.2927771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.2928152Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.2928509Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.2929061Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.2929178Z dist init r=3, world=4 2022-11-23T03:12:18.2929275Z dist init r=2, world=4 2022-11-23T03:12:18.2929386Z dist init r=1, world=4 2022-11-23T03:12:18.2929549Z dist init r=0, world=4 2022-11-23T03:12:18.2929654Z ok (7.727s) 2022-11-23T03:12:18.2930000Z test_mixture_of_experts_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14462 2022-11-23T03:12:18.2930206Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14463 2022-11-23T03:12:18.2930489Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14464 2022-11-23T03:12:18.2930641Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14465 2022-11-23T03:12:18.2931003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2931181Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2931563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2931746Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2932115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2932272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2932655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2932835Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2933203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.2933381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2933760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2933956Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2934481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.2934640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.2934986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.2935171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.2935409Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.2935632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.2935867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.2936099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.2936712Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2937106Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2937495Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.2937864Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.2938096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.2938326Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.2938556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.2938817Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.2939982Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2940097Z warnings.warn( 2022-11-23T03:12:18.2941087Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2941244Z warnings.warn( 2022-11-23T03:12:18.2941484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.2942654Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2942755Z warnings.warn( 2022-11-23T03:12:18.2942975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.2943217Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.2944408Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.2944533Z warnings.warn( 2022-11-23T03:12:18.2944758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.2945161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2945553Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2946318Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.2946720Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.2946962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.2947190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.2947410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.2947808Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2948046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.2948424Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2949037Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2949413Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.2949645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.2949875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.2950093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.2950449Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2950677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.2951054Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.2951414Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2951787Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.2952513Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2953221Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2954129Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2954867Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2955112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.2955353Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.2955628Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.2956036Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2956256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.2956806Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2957168Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2957542Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.2957772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.2958048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.2958266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.2958647Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2958875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.2959235Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.2959590Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2959961Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.2960200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.2960414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.2960696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.2961024Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2961239Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.2961893Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2962181Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2962549Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.2963297Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2964022Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2965052Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2965734Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2965925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.2966222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.2966382Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.2966936Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2967222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.2967605Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2967998Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.2968385Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.2968652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.2968892Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.2969118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.2969511Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2969753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.2970128Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2970517Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2970900Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.2971145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.2971361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.2971580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.2971819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.2972223Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.2972619Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2973157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2973524Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.2974408Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2975200Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2975953Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2976692Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.2977065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.2977256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.2977424Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.2977917Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2978114Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.2978452Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2978845Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2979242Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.2979481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.2979701Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.2979931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.2980324Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2980545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.2980936Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.2981319Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2981709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.2981948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.2982181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.2982399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.2982789Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2983029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.2983447Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2984262Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2984646Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.2985281Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2985980Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2986778Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2987018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.2987235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.2987461Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.2987845Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2988135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.2988515Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2989051Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2989444Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.2990187Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2990430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.2990673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.2990894Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.2991287Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2991511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.2992061Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2992444Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2993418Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2994444Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T03:12:18.2994652Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T03:12:18.2995024Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.2995760Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.2996059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.2996296Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.2996532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.2996931Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2997175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.2997568Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2997966Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.2998359Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.2998578Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.2998810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.2999046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.2999468Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.2999682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.3000085Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.3000628Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.3001273Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.3001427Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.3001640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.3001873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.3002262Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.3002502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.3002981Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.3003337Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.3003879Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.3004601Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3005372Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3006097Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3006334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.3006564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.3006783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.3007152Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.3007393Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.3007775Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.3008154Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.3008533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.3008862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.3008995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.3009221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.3009774Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.3009996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.3010390Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.3010780Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.3011170Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 
2022-11-23T03:12:18.3011410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.3011644Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.3011924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.3012322Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.3012564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.3012931Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.3013359Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.3013772Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.3014146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.3014416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.3014643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.3015021Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.3015254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.3015631Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.3016074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.3016362Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.3017273Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3017517Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.3017754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.3017989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.3018219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.3018611Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.3019000Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.3019397Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.3019870Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 
2022-11-23T03:12:18.3020003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.3020274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.3020508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.3020899Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.3021297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.3021709Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.3022082Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.3022629Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.3023373Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3024294Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.3024609Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.3024825Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.3025058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.3025619Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.3025842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.3026204Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.3026594Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.3026974Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.3027073Z dist init r=1, world=4 2022-11-23T03:12:18.3027167Z dist init r=2, world=4 2022-11-23T03:12:18.3027255Z dist init r=0, world=4 2022-11-23T03:12:18.3027349Z dist init r=3, world=4 2022-11-23T03:12:18.3027433Z ok (7.626s) 2022-11-23T03:12:18.3027776Z test_mixture_of_experts_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15243 2022-11-23T03:12:18.3027976Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15244 2022-11-23T03:12:18.3028188Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15245 2022-11-23T03:12:18.3028400Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15246 2022-11-23T03:12:18.3028772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.3029026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.3029471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.3029666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.3030033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.3030213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.3030687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.3030786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.3031219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.3031383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.3031750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.3031925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.3032297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T03:12:18.3032489Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.3033022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.3033210Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.3033498Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.3033739Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.3033954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.3034188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.3034577Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.3034956Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.3035329Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.3035700Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.3035930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.3036153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.3036373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.3036743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.3037763Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.3037889Z warnings.warn( 2022-11-23T03:12:18.3038137Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.3039143Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.3039259Z warnings.warn( 2022-11-23T03:12:18.3040467Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.3040591Z warnings.warn( 2022-11-23T03:12:18.3041638Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.3041749Z warnings.warn( 2022-11-23T03:12:18.3042215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.3042405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.3042700Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.3043151Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.3043477Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.3043612Z File "", line 1, in 2022-11-23T03:12:18.3043829Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3043977Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3044186Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3044342Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3044560Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3044649Z self.run() 2022-11-23T03:12:18.3044858Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3045007Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3045354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3045494Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3045860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3045987Z getattr(self, test_name)() 2022-11-23T03:12:18.3046353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3046589Z fn() 2022-11-23T03:12:18.3046948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3047071Z test(self, **param_kwargs) 2022-11-23T03:12:18.3047426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3047552Z return func(*args, **kwargs) 2022-11-23T03:12:18.3047794Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3047909Z self.run_subtests( 2022-11-23T03:12:18.3048254Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3048392Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3048744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3048895Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3049260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3049380Z output = model(*input) 2022-11-23T03:12:18.3049748Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3049895Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3050267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3050419Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3050777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3050898Z _lazy_init(state, module) 2022-11-23T03:12:18.3051244Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3051386Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3051714Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3051888Z return func(*args, **kwargs) 2022-11-23T03:12:18.3052422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3052508Z p_assert( 2022-11-23T03:12:18.3052845Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3052972Z traceback.print_stack() 2022-11-23T03:12:18.3053367Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.3053494Z File "", line 1, in 2022-11-23T03:12:18.3053707Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3053849Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3054050Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3054181Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3054402Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3054505Z self.run() 2022-11-23T03:12:18.3054707Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3054851Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3055346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3055478Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3055827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3055930Z getattr(self, test_name)() 2022-11-23T03:12:18.3056281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3056379Z fn() 2022-11-23T03:12:18.3056731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3056861Z test(self, **param_kwargs) 2022-11-23T03:12:18.3057206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3057330Z return func(*args, **kwargs) 2022-11-23T03:12:18.3057549Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3057661Z self.run_subtests( 2022-11-23T03:12:18.3058000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3058162Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3058601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3058755Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3059110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3059236Z output = model(*input) 2022-11-23T03:12:18.3059538Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3059679Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3060043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3060213Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3060737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3060873Z _lazy_init(state, module) 2022-11-23T03:12:18.3061222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3061415Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3061769Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3061981Z return func(*args, **kwargs) 2022-11-23T03:12:18.3062260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T03:12:18.3062368Z p_assert( 2022-11-23T03:12:18.3062707Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3062837Z traceback.print_stack() 2022-11-23T03:12:18.3063237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.3063369Z File "", line 1, in 2022-11-23T03:12:18.3063562Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3063707Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3064108Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3064273Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3064490Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3064599Z self.run() 2022-11-23T03:12:18.3064808Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3064957Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3065287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3065425Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3065789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3065916Z getattr(self, test_name)() 2022-11-23T03:12:18.3066277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3066384Z fn() 2022-11-23T03:12:18.3066754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3066882Z test(self, **param_kwargs) 2022-11-23T03:12:18.3067216Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3067347Z return func(*args, **kwargs) 2022-11-23T03:12:18.3067595Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3067711Z self.run_subtests( 2022-11-23T03:12:18.3068070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3068235Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3068647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3068999Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3069370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3069588Z output = model(*input) 2022-11-23T03:12:18.3069922Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3070067Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3070454Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3070580Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3070994Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3071091Z _lazy_init(state, module) 2022-11-23T03:12:18.3071525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3071584Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3071924Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.3072052Z return func(*args, **kwargs) 2022-11-23T03:12:18.3072434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3072542Z p_assert( 2022-11-23T03:12:18.3072953Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3073008Z traceback.print_stack() 2022-11-23T03:12:18.3073194Z File "", line 1, in 2022-11-23T03:12:18.3073331Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3073477Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3073692Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3073846Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3074061Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3074170Z self.run() 2022-11-23T03:12:18.3074353Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3074505Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3074905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3074987Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3075348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3075475Z getattr(self, test_name)() 2022-11-23T03:12:18.3075836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3075945Z fn() 2022-11-23T03:12:18.3076291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3076417Z test(self, **param_kwargs) 2022-11-23T03:12:18.3076775Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3077007Z return func(*args, **kwargs) 2022-11-23T03:12:18.3077153Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3077274Z self.run_subtests( 2022-11-23T03:12:18.3077626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3077791Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3078132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3078340Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3078729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3078850Z output = model(*input) 2022-11-23T03:12:18.3079175Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3079319Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3079695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3079873Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3080305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3080348Z _lazy_init(state, module) 2022-11-23T03:12:18.3080762Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3080911Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3081251Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.3081379Z return func(*args, **kwargs) 2022-11-23T03:12:18.3081762Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3081956Z p_assert( 2022-11-23T03:12:18.3082284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3082393Z traceback.print_stack() 2022-11-23T03:12:18.3082629Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.3082905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.3083171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.3083410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.3083721Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 
2022-11-23T03:12:18.3083856Z File "", line 1, in 2022-11-23T03:12:18.3084046Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3084192Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3084398Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3084556Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3084773Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3084880Z self.run() 2022-11-23T03:12:18.3085089Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3085238Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3085559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3085697Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3086059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3086186Z getattr(self, test_name)() 2022-11-23T03:12:18.3086544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3086646Z fn() 2022-11-23T03:12:18.3087010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3087136Z test(self, **param_kwargs) 2022-11-23T03:12:18.3087527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3087663Z return func(*args, **kwargs) 2022-11-23T03:12:18.3087910Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3088028Z self.run_subtests( 2022-11-23T03:12:18.3088383Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3088547Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3088910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3089068Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3089422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3089593Z output = model(*input) 2022-11-23T03:12:18.3089928Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3090073Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3090450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3090630Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3090995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3091117Z _lazy_init(state, module) 2022-11-23T03:12:18.3091447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3091593Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3091931Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3092063Z return func(*args, **kwargs) 2022-11-23T03:12:18.3092447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3092551Z p_assert( 2022-11-23T03:12:18.3092894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3093023Z traceback.print_stack() 2022-11-23T03:12:18.3093398Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.3093793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.3094178Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.3094310Z File "", line 1, in 2022-11-23T03:12:18.3094530Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3094679Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3094887Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3095041Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3095235Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3095342Z self.run() 2022-11-23T03:12:18.3095544Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3095692Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3096032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3096167Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3096529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3096659Z getattr(self, test_name)() 2022-11-23T03:12:18.3097046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3097160Z fn() 2022-11-23T03:12:18.3097531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.3097658Z test(self, **param_kwargs) 2022-11-23T03:12:18.3098014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3098201Z return func(*args, **kwargs) 2022-11-23T03:12:18.3098393Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3098512Z self.run_subtests( 2022-11-23T03:12:18.3098843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3099058Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3099423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3099573Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3099939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3100056Z output = model(*input) 2022-11-23T03:12:18.3100379Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3100519Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3100873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3101047Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3101412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3101611Z _lazy_init(state, module) 2022-11-23T03:12:18.3101889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3102030Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3102363Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3102486Z return func(*args, **kwargs) 2022-11-23T03:12:18.3102843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3102944Z p_assert( 2022-11-23T03:12:18.3103279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3103403Z traceback.print_stack() 2022-11-23T03:12:18.3103533Z File "", line 1, in 2022-11-23T03:12:18.3103743Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3104082Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3104278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3104428Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3104639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3104743Z self.run() 2022-11-23T03:12:18.3104946Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3105092Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3105441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3105577Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3105923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3106053Z getattr(self, test_name)() 2022-11-23T03:12:18.3106495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3106607Z fn() 2022-11-23T03:12:18.3106978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.3107166Z test(self, **param_kwargs) 2022-11-23T03:12:18.3107462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3107590Z return func(*args, **kwargs) 2022-11-23T03:12:18.3107700Z File "", line 1, in 2022-11-23T03:12:18.3107944Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3108056Z self.run_subtests( 2022-11-23T03:12:18.3108404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3108632Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3108839Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3108980Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3109346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3109483Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3109684Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3109833Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3110203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3110320Z output = model(*input) 2022-11-23T03:12:18.3110531Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3110638Z self.run() 2022-11-23T03:12:18.3110947Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3111087Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3111290Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3111436Z 
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3111809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3111984Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3112318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3112448Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3112793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3112921Z _lazy_init(state, module) 2022-11-23T03:12:18.3113281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3113405Z getattr(self, test_name)() 2022-11-23T03:12:18.3113753Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3113983Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3114249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3114345Z fn() 2022-11-23T03:12:18.3114664Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3114788Z return func(*args, **kwargs) 2022-11-23T03:12:18.3115156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3115282Z test(self, **param_kwargs) 2022-11-23T03:12:18.3115702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3115811Z p_assert( 2022-11-23T03:12:18.3116167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.3116289Z return func(*args, **kwargs) 2022-11-23T03:12:18.3116604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3116732Z traceback.print_stack() 2022-11-23T03:12:18.3116982Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3117097Z self.run_subtests( 2022-11-23T03:12:18.3117447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3117654Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3118017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3118169Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3118524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3118643Z output = model(*input) 2022-11-23T03:12:18.3118966Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3119105Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3119479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3119653Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3120021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3120205Z _lazy_init(state, module) 2022-11-23T03:12:18.3120474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3120616Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3120953Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3121076Z return func(*args, **kwargs) 2022-11-23T03:12:18.3121455Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3121555Z p_assert( 2022-11-23T03:12:18.3121886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3122011Z traceback.print_stack() 2022-11-23T03:12:18.3122237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.3122488Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.3122728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.3122963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.3123362Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.3123753Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.3123882Z File "", line 1, in 2022-11-23T03:12:18.3124094Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3124235Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3124472Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3124631Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3124843Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3124945Z self.run() 2022-11-23T03:12:18.3125148Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3125292Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3125633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3125750Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3126106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3126228Z getattr(self, test_name)() 2022-11-23T03:12:18.3126584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3126727Z fn() 2022-11-23T03:12:18.3127093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3127216Z test(self, **param_kwargs) 2022-11-23T03:12:18.3127569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3127675Z return func(*args, **kwargs) 2022-11-23T03:12:18.3127921Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3128051Z self.run_subtests( 2022-11-23T03:12:18.3128385Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3128547Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3128908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3129065Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3129438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3129541Z output = model(*input) 2022-11-23T03:12:18.3129863Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3130001Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3130373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3130547Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3130911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3131030Z _lazy_init(state, module) 2022-11-23T03:12:18.3131384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3131509Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3131847Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3131969Z return func(*args, **kwargs) 2022-11-23T03:12:18.3132346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3132446Z p_assert( 2022-11-23T03:12:18.3132778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3132902Z traceback.print_stack() 2022-11-23T03:12:18.3133031Z File "", line 1, in 2022-11-23T03:12:18.3133379Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3133524Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3133766Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3133921Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3134130Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3134235Z self.run() 2022-11-23T03:12:18.3134434Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3134556Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3134888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3135007Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3135358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3135483Z getattr(self, test_name)() 2022-11-23T03:12:18.3135902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3135963Z fn() 2022-11-23T03:12:18.3136319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3136420Z test(self, **param_kwargs) 2022-11-23T03:12:18.3136766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3136889Z return func(*args, **kwargs) 2022-11-23T03:12:18.3137311Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3137429Z self.run_subtests( 2022-11-23T03:12:18.3137766Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3137930Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3138288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3138429Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3138809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3138933Z output = model(*input) 2022-11-23T03:12:18.3139261Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3139409Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3139828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3140066Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3140375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3140498Z _lazy_init(state, module) 2022-11-23T03:12:18.3140839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3140985Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3141382Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3141514Z return func(*args, **kwargs) 2022-11-23T03:12:18.3141896Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3142001Z p_assert( 2022-11-23T03:12:18.3142337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3142534Z traceback.print_stack() 2022-11-23T03:12:18.3142848Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.3143294Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.3143431Z File "", line 1, in 2022-11-23T03:12:18.3143651Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3143796Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3144190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3144353Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3144547Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3144657Z self.run() 2022-11-23T03:12:18.3144867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3145016Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3145150Z File "", line 1, in 2022-11-23T03:12:18.3145502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3145719Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3146089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3146196Z getattr(self, test_name)() 2022-11-23T03:12:18.3146409Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3146645Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3146918Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3147021Z fn() 2022-11-23T03:12:18.3147225Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, 
in _main 2022-11-23T03:12:18.3147380Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3147900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3148006Z test(self, **param_kwargs) 2022-11-23T03:12:18.3148217Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3148323Z self.run() 2022-11-23T03:12:18.3148673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3148797Z return func(*args, **kwargs) 2022-11-23T03:12:18.3148997Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3149142Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3149362Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3149475Z self.run_subtests( 2022-11-23T03:12:18.3149804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3149936Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3150285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3150447Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3150798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3150923Z getattr(self, test_name)() 2022-11-23T03:12:18.3151255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3151405Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3151752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:18.3151851Z fn() 2022-11-23T03:12:18.3152219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3152337Z output = model(*input) 2022-11-23T03:12:18.3152756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3152889Z test(self, **param_kwargs) 2022-11-23T03:12:18.3153184Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3153326Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3153677Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3153803Z return func(*args, **kwargs) 2022-11-23T03:12:18.3154168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3154343Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3154581Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3154741Z self.run_subtests( 2022-11-23T03:12:18.3155080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3155201Z _lazy_init(state, module) 2022-11-23T03:12:18.3155540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3155700Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3156037Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3156181Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3156533Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3156683Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3156991Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3157121Z return func(*args, **kwargs) 2022-11-23T03:12:18.3157485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3157605Z output = model(*input) 2022-11-23T03:12:18.3157970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3158072Z p_assert( 2022-11-23T03:12:18.3158386Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3158526Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3158828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3158955Z traceback.print_stack() 2022-11-23T03:12:18.3159320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3159500Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3159855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3159978Z _lazy_init(state, module) 2022-11-23T03:12:18.3160316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3160456Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3160762Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3160887Z return func(*args, 
**kwargs) 2022-11-23T03:12:18.3161256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3161405Z p_assert( 2022-11-23T03:12:18.3161684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3161859Z traceback.print_stack() 2022-11-23T03:12:18.3162380Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.3162532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.3162756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.3162997Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.3163407Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 
2022-11-23T03:12:18.3163585Z File "", line 1, in 2022-11-23T03:12:18.3163758Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3163904Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3164163Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3164319Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3164512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3164620Z self.run() 2022-11-23T03:12:18.3164828Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3164978Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3165324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3165461Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3165825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3165951Z getattr(self, test_name)() 2022-11-23T03:12:18.3166287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3166396Z fn() 2022-11-23T03:12:18.3166765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3167048Z test(self, **param_kwargs) 2022-11-23T03:12:18.3167397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3167697Z return func(*args, **kwargs) 2022-11-23T03:12:18.3167951Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3168067Z self.run_subtests( 2022-11-23T03:12:18.3168398Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3168672Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3168980Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3169143Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3169520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3169644Z output = model(*input) 2022-11-23T03:12:18.3169969Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3170112Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3170469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3170648Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3171016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3171140Z _lazy_init(state, module) 2022-11-23T03:12:18.3171555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3171731Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3172054Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3172184Z return func(*args, **kwargs) 2022-11-23T03:12:18.3172539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3172645Z p_assert( 2022-11-23T03:12:18.3172982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3173111Z traceback.print_stack() 2022-11-23T03:12:18.3173665Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.3174053Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.3174487Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.3174616Z File "", line 1, in 2022-11-23T03:12:18.3174723Z File "", line 1, in 2022-11-23T03:12:18.3175102Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3175247Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3175453Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3175606Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3175819Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3175967Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3176288Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3176288Z self.run() 2022-11-23T03:12:18.3176491Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3176648Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3176853Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3177004Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3177318Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3177331Z self.run() 2022-11-23T03:12:18.3177650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3177787Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:18.3177992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3178140Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3178505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3178637Z getattr(self, test_name)() 2022-11-23T03:12:18.3178979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3179113Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3179383Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3179732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3179832Z fn() 2022-11-23T03:12:18.3180359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3180484Z getattr(self, test_name)() 2022-11-23T03:12:18.3180695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3180839Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3181300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3181412Z test(self, **param_kwargs) 2022-11-23T03:12:18.3181774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3181876Z fn() 2022-11-23T03:12:18.3182078Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3182231Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3182590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3182716Z return func(*args, **kwargs) 2022-11-23T03:12:18.3183083Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3183187Z test(self, **param_kwargs) 2022-11-23T03:12:18.3183402Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3183565Z self.run() 2022-11-23T03:12:18.3183815Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3184283Z self.run_subtests( 2022-11-23T03:12:18.3184648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3184773Z return func(*args, **kwargs) 2022-11-23T03:12:18.3184952Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3185097Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3185440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3185598Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3186012Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3186135Z self.run_subtests( 2022-11-23T03:12:18.3186479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3186615Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3186961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3187117Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3187469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3187634Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3187997Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3188124Z getattr(self, test_name)() 2022-11-23T03:12:18.3188497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3188784Z output = model(*input) 2022-11-23T03:12:18.3189117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3189447Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3189803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3189905Z fn() 2022-11-23T03:12:18.3190231Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3190375Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3190750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3190874Z output = model(*input) 2022-11-23T03:12:18.3191218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3191434Z test(self, **param_kwargs) 2022-11-23T03:12:18.3191834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3192014Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3192343Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3192485Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3192843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3192971Z return 
func(*args, **kwargs) 2022-11-23T03:12:18.3193316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3193440Z _lazy_init(state, module) 2022-11-23T03:12:18.3193893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3194070Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3194318Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3194435Z self.run_subtests( 2022-11-23T03:12:18.3194783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3195034Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3195296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3195399Z _lazy_init(state, module) 2022-11-23T03:12:18.3195752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3195922Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3196265Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3196394Z return func(*args, **kwargs) 2022-11-23T03:12:18.3196762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3196918Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3197270Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3197395Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3197773Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3197879Z p_assert( 2022-11-23T03:12:18.3198257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3198382Z output = model(*input) 2022-11-23T03:12:18.3198735Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3198855Z return func(*args, **kwargs) 2022-11-23T03:12:18.3199172Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3199303Z traceback.print_stack() 2022-11-23T03:12:18.3199631Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3199778Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3200156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3200262Z p_assert( 2022-11-23T03:12:18.3200653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3200864Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3201368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3201472Z traceback.print_stack() 2022-11-23T03:12:18.3201912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3201954Z _lazy_init(state, module) 2022-11-23T03:12:18.3202293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3202433Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3202761Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3202884Z return func(*args, **kwargs) 2022-11-23T03:12:18.3203229Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3203384Z p_assert( 2022-11-23T03:12:18.3203716Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3203841Z traceback.print_stack() 2022-11-23T03:12:18.3204083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.3204321Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.3204558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.3204791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.3205180Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.3205542Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.3205677Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3205882Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3206023Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3206224Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3206371Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3206582Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3206688Z self.run() 2022-11-23T03:12:18.3206865Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3207007Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3207339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3207544Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3207832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3207954Z getattr(self, test_name)() 2022-11-23T03:12:18.3208303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3208379Z fn() 2022-11-23T03:12:18.3208735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3208859Z test(self, **param_kwargs) 2022-11-23T03:12:18.3209205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3209332Z return func(*args, **kwargs) 2022-11-23T03:12:18.3209572Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3209689Z self.run_subtests( 2022-11-23T03:12:18.3210246Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3210397Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3210765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3210920Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3211295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3211417Z output = model(*input) 2022-11-23T03:12:18.3211742Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3211886Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3212261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3212474Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3212845Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3212970Z _lazy_init(state, module) 2022-11-23T03:12:18.3213321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3213480Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3213810Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3213939Z return func(*args, **kwargs) 2022-11-23T03:12:18.3214327Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3214402Z p_assert( 2022-11-23T03:12:18.3214897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3215027Z traceback.print_stack() 2022-11-23T03:12:18.3215161Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3215369Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3215512Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3215709Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3215860Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3216047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3216151Z self.run() 2022-11-23T03:12:18.3216353Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3216496Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3216827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3216966Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3217321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3217615Z getattr(self, test_name)() 2022-11-23T03:12:18.3217956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3218057Z fn() 2022-11-23T03:12:18.3218425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3218552Z test(self, **param_kwargs) 2022-11-23T03:12:18.3218914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3219045Z return func(*args, **kwargs) 2022-11-23T03:12:18.3219294Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3219394Z self.run_subtests( 2022-11-23T03:12:18.3219794Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3219967Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3220567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3220642Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3221002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3221122Z output = model(*input) 2022-11-23T03:12:18.3221440Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3221559Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3221924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3222154Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3222514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3222637Z _lazy_init(state, module) 2022-11-23T03:12:18.3222979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3223120Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3223630Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3223759Z return func(*args, **kwargs) 2022-11-23T03:12:18.3224311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3224421Z p_assert( 2022-11-23T03:12:18.3224761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3224898Z traceback.print_stack() 2022-11-23T03:12:18.3225302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.3225699Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.3225835Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3225968Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3226159Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3226305Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3226512Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3226668Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3227037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3227184Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3227394Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3227478Z self.run() 2022-11-23T03:12:18.3227673Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3227822Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3228020Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3228161Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3228409Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3228474Z self.run() 2022-11-23T03:12:18.3228809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3228920Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3229118Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3229340Z
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3229885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3230013Z getattr(self, test_name)() 2022-11-23T03:12:18.3230355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3230490Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3230851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3230932Z fn() 2022-11-23T03:12:18.3231293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3231418Z getattr(self, test_name)() 2022-11-23T03:12:18.3231784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3231980Z test(self, **param_kwargs) 2022-11-23T03:12:18.3232340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3232439Z fn() 2022-11-23T03:12:18.3232777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3232907Z return func(*args, **kwargs) 2022-11-23T03:12:18.3233270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3233397Z test(self, **param_kwargs) 2022-11-23T03:12:18.3233647Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3233764Z self.run_subtests( 2022-11-23T03:12:18.3234117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3234246Z return func(*args, **kwargs) 
2022-11-23T03:12:18.3234731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3234893Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3235128Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3235242Z self.run_subtests( 2022-11-23T03:12:18.3235589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3235741Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3236079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3236238Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3236582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3236706Z output = model(*input) 2022-11-23T03:12:18.3237059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3237209Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3237703Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3237852Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3238229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3238354Z output = model(*input) 2022-11-23T03:12:18.3238709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3238887Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3239280Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3239434Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3239803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3239927Z _lazy_init(state, module) 2022-11-23T03:12:18.3240303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3240640Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3240981Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3241101Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3241509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3241680Z _lazy_init(state, module) 2022-11-23T03:12:18.3242012Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3242137Z return func(*args, **kwargs) 2022-11-23T03:12:18.3242470Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3242611Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3243166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3243250Z p_assert( 2022-11-23T03:12:18.3243633Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3243719Z return func(*args, **kwargs) 2022-11-23T03:12:18.3244057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3244192Z traceback.print_stack() 
2022-11-23T03:12:18.3244575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3244680Z p_assert( 2022-11-23T03:12:18.3244992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3245122Z traceback.print_stack() 2022-11-23T03:12:18.3245371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.3245618Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.3245863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.3246103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.3246503Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 
2022-11-23T03:12:18.3246644Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3246857Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3247015Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3247193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3247347Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3247561Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3247669Z self.run() 2022-11-23T03:12:18.3247875Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3248026Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3248349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3248492Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3248902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3249036Z getattr(self, test_name)() 2022-11-23T03:12:18.3249643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3249658Z fn() 2022-11-23T03:12:18.3250003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3250125Z test(self, **param_kwargs) 2022-11-23T03:12:18.3250450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3250574Z return func(*args, **kwargs) 2022-11-23T03:12:18.3250815Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3250977Z self.run_subtests( 2022-11-23T03:12:18.3251324Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3251485Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3251838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3251987Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3252329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3252451Z output = model(*input) 2022-11-23T03:12:18.3252768Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3252908Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3253272Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3253452Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3253811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3253932Z _lazy_init(state, module) 2022-11-23T03:12:18.3254430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3254579Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3254918Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3255045Z return func(*args, **kwargs) 2022-11-23T03:12:18.3255423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3255530Z p_assert( 2022-11-23T03:12:18.3255870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3256006Z traceback.print_stack() 2022-11-23T03:12:18.3256385Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.3256783Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.3257324Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.3257454Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3257580Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3257787Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3257929Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3258126Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3258258Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3258509Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3258657Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3258867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3258972Z self.run() 2022-11-23T03:12:18.3259171Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3259320Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3259517Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3259639Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3259846Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3259950Z self.run() 2022-11-23T03:12:18.3260290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3260466Z self.run_test(test_name,
pipe) 2022-11-23T03:12:18.3260668Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3260813Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3261145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3261269Z getattr(self, test_name)() 2022-11-23T03:12:18.3261600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3261735Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3262084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3262188Z fn() 2022-11-23T03:12:18.3262798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3262843Z getattr(self, test_name)() 2022-11-23T03:12:18.3263190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3263317Z test(self, **param_kwargs) 2022-11-23T03:12:18.3263671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3263770Z fn() 2022-11-23T03:12:18.3264311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3264443Z return func(*args, **kwargs) 2022-11-23T03:12:18.3264806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3264929Z test(self, **param_kwargs) 2022-11-23T03:12:18.3265156Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3265369Z self.run_subtests( 2022-11-23T03:12:18.3265788Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3265912Z return func(*args, **kwargs) 2022-11-23T03:12:18.3266252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3266415Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3266656Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3266770Z self.run_subtests( 2022-11-23T03:12:18.3267099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3267249Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3267592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3267996Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3268389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3268514Z output = model(*input) 2022-11-23T03:12:18.3268933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3269089Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3269395Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3269540Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3269917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3270039Z output = model(*input) 2022-11-23T03:12:18.3270416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T03:12:18.3270667Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3271000Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3271239Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3271494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3271619Z _lazy_init(state, module) 2022-11-23T03:12:18.3272071Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3272175Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3272527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3272679Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3273046Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3273171Z _lazy_init(state, module) 2022-11-23T03:12:18.3273591Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3273623Z return func(*args, **kwargs) 2022-11-23T03:12:18.3274121Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3274262Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3274627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3274729Z p_assert( 2022-11-23T03:12:18.3275058Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3275183Z return func(*args, **kwargs) 2022-11-23T03:12:18.3275670Z File
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3275798Z traceback.print_stack() 2022-11-23T03:12:18.3276179Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3276289Z p_assert( 2022-11-23T03:12:18.3276424Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3276767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3276897Z traceback.print_stack() 2022-11-23T03:12:18.3277109Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3277301Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3277442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3277681Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3277863Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3277978Z self.run() 2022-11-23T03:12:18.3278184Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3278334Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3278678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3278792Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3279154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3279281Z getattr(self, test_name)() 2022-11-23T03:12:18.3279794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3279896Z fn() 2022-11-23T03:12:18.3280250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3280598Z test(self, **param_kwargs) 2022-11-23T03:12:18.3280936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3281064Z return func(*args, **kwargs) 2022-11-23T03:12:18.3281314Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3281431Z self.run_subtests( 2022-11-23T03:12:18.3281802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3281948Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3282311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3282468Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3282832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3282956Z output = model(*input) 2022-11-23T03:12:18.3283292Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3283439Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3283819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3283996Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3284368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3284744Z _lazy_init(state, module) 2022-11-23T03:12:18.3284998Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3285122Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3285452Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3285577Z return func(*args, **kwargs) 2022-11-23T03:12:18.3285947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3286050Z p_assert( 2022-11-23T03:12:18.3286373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3286497Z traceback.print_stack() 2022-11-23T03:12:18.3286717Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.3286957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.3287192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.3287477Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.3287876Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.3288070Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3288278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3288419Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3288622Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3288749Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3288958Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3289064Z self.run() 2022-11-23T03:12:18.3289263Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3289408Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3289984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3290122Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3290465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3290595Z getattr(self, test_name)() 2022-11-23T03:12:18.3290953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3291056Z fn() 2022-11-23T03:12:18.3291422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3291548Z test(self, **param_kwargs) 2022-11-23T03:12:18.3291904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3292036Z return func(*args, **kwargs) 2022-11-23T03:12:18.3292271Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3292388Z self.run_subtests( 2022-11-23T03:12:18.3292743Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3293064Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3293591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3293747Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3294122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3294247Z output = model(*input) 2022-11-23T03:12:18.3294552Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3294701Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3295081Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3295261Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3295628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3295753Z _lazy_init(state, module) 2022-11-23T03:12:18.3296104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3296249Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3296566Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3296694Z return func(*args, **kwargs) 2022-11-23T03:12:18.3297070Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3297274Z p_assert( 2022-11-23T03:12:18.3297566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3297701Z traceback.print_stack() 2022-11-23T03:12:18.3298105Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.3298596Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.3298896Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.3299109Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3299225Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3299371Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3299630Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3299815Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3300008Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3300116Z self.run() 2022-11-23T03:12:18.3300299Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3300447Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3300795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3300999Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3301298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3301425Z getattr(self, test_name)() 2022-11-23T03:12:18.3301783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3301887Z fn() 2022-11-23T03:12:18.3302246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3302363Z test(self, **param_kwargs) 2022-11-23T03:12:18.3302724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3302856Z return func(*args, **kwargs) 2022-11-23T03:12:18.3303105Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3303323Z self.run_subtests( 2022-11-23T03:12:18.3303571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3303736Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3304196Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3304592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3304737Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3305176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3305276Z output = model(*input) 2022-11-23T03:12:18.3305430Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3305572Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3305702Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3306179Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3306323Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3306530Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3306683Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3307147Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3307338Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs)
2022-11-23T03:12:18.3307551Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3307699Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3308044Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3308172Z _lazy_init(state, module) 2022-11-23T03:12:18.3308388Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3308498Z self.run() 2022-11-23T03:12:18.3308701Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3308857Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3309128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3309258Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3309617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3309762Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3309982Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3310090Z self.run() 2022-11-23T03:12:18.3310429Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3310557Z return func(*args, **kwargs) 2022-11-23T03:12:18.3310895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3311009Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3311212Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3311364Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3311727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3311854Z 
getattr(self, test_name)() 2022-11-23T03:12:18.3312238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3312344Z p_assert( 2022-11-23T03:12:18.3312682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3312795Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3313155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3313256Z fn() 2022-11-23T03:12:18.3313594Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3313727Z traceback.print_stack() 2022-11-23T03:12:18.3314189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3314249Z getattr(self, test_name)() 2022-11-23T03:12:18.3314644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3314768Z test(self, **param_kwargs) 2022-11-23T03:12:18.3315211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3315309Z fn() 2022-11-23T03:12:18.3315651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3315774Z return func(*args, **kwargs) 2022-11-23T03:12:18.3316124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3316425Z test(self, **param_kwargs) 2022-11-23T03:12:18.3316701Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3316826Z self.run_subtests( 2022-11-23T03:12:18.3317189Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3317316Z return func(*args, **kwargs) 2022-11-23T03:12:18.3317670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3317835Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3318088Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3318206Z self.run_subtests( 2022-11-23T03:12:18.3318548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3318752Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3319112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3319282Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3319658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3319783Z output = model(*input) 2022-11-23T03:12:18.3320148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3320354Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3320666Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3320882Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3321188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3321322Z output = model(*input) 2022-11-23T03:12:18.3321852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T03:12:18.3322025Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3322342Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3322480Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3322836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3322936Z _lazy_init(state, module) 2022-11-23T03:12:18.3323300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3323471Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3324002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3324148Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3324513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3324636Z _lazy_init(state, module) 2022-11-23T03:12:18.3324974Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3325080Z return func(*args, **kwargs) 2022-11-23T03:12:18.3325472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3325574Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3326044Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3326145Z p_assert( 2022-11-23T03:12:18.3326451Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3326585Z return func(*args, **kwargs) 2022-11-23T03:12:18.3326928Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3327036Z traceback.print_stack() 2022-11-23T03:12:18.3327558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3327664Z p_assert( 2022-11-23T03:12:18.3327983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3328106Z traceback.print_stack() 2022-11-23T03:12:18.3328496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.3328745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.3329082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.3329305Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.3329710Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 
2022-11-23T03:12:18.3329845Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3330060Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3330209Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3330417Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3330570Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3330787Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3330881Z self.run() 2022-11-23T03:12:18.3331090Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3331240Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3331581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3331718Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3332080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3332208Z getattr(self, test_name)() 2022-11-23T03:12:18.3332549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3332649Z fn() 2022-11-23T03:12:18.3333015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3333143Z test(self, **param_kwargs) 2022-11-23T03:12:18.3333656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3333781Z return func(*args, **kwargs) 2022-11-23T03:12:18.3334021Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3334134Z self.run_subtests( 2022-11-23T03:12:18.3334454Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3334614Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3334962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3335112Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3335472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3335589Z output = model(*input) 2022-11-23T03:12:18.3335954Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3336099Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3336444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3336616Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3336974Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3337098Z _lazy_init(state, module) 2022-11-23T03:12:18.3337440Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3337583Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3338091Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3338270Z return func(*args, **kwargs) 2022-11-23T03:12:18.3338636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3338745Z p_assert( 2022-11-23T03:12:18.3339083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3339212Z traceback.print_stack() 2022-11-23T03:12:18.3339613Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.3340009Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.3340397Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.3340533Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3340901Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3341031Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3341233Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3341438Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3341645Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3341749Z self.run() 2022-11-23T03:12:18.3341950Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3342096Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3342408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3342545Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3342896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3343280Z getattr(self, test_name)() 2022-11-23T03:12:18.3343559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3343661Z fn() 2022-11-23T03:12:18.3344211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3344343Z test(self, **param_kwargs) 2022-11-23T03:12:18.3344684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3344811Z return func(*args, **kwargs) 2022-11-23T03:12:18.3345059Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3345177Z self.run_subtests( 2022-11-23T03:12:18.3345529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3345699Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3346130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3346295Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3346653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3346775Z output = model(*input) 2022-11-23T03:12:18.3347260Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3347404Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3347769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3347942Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3348074Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3348503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3348605Z _lazy_init(state, module) 2022-11-23T03:12:18.3348809Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3348954Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3349295Z
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3349436Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3349636Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3349787Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3350117Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3350220Z return func(*args, **kwargs) 2022-11-23T03:12:18.3350429Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3350542Z self.run() 2022-11-23T03:12:18.3350910Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3351013Z p_assert( 2022-11-23T03:12:18.3351216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3351361Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3351669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3351794Z traceback.print_stack() 2022-11-23T03:12:18.3352120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3352255Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3352602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3352728Z getattr(self, test_name)() 2022-11-23T03:12:18.3353076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3353175Z fn() 2022-11-23T03:12:18.3353508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.3353632Z test(self, **param_kwargs) 2022-11-23T03:12:18.3353979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3354104Z return func(*args, **kwargs) 2022-11-23T03:12:18.3354233Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3354471Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3354585Z self.run_subtests( 2022-11-23T03:12:18.3354926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3355114Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3355328Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3355469Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3355827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3355979Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3356179Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3356331Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3356696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3356795Z output = model(*input) 2022-11-23T03:12:18.3357004Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3357161Z self.run() 2022-11-23T03:12:18.3357484Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3357623Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3357823Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3357970Z
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3358337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3358490Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3358818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3358950Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3359311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3359435Z _lazy_init(state, module) 2022-11-23T03:12:18.3359785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3359908Z getattr(self, test_name)() 2022-11-23T03:12:18.3360252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3360372Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3360722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3360822Z fn() 2022-11-23T03:12:18.3361335Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3361467Z return func(*args, **kwargs) 2022-11-23T03:12:18.3361829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3361952Z test(self, **param_kwargs) 2022-11-23T03:12:18.3362327Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3362413Z p_assert( 2022-11-23T03:12:18.3362765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.3362880Z return func(*args, **kwargs) 2022-11-23T03:12:18.3363304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3363331Z traceback.print_stack() 2022-11-23T03:12:18.3363569Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3363672Z self.run_subtests( 2022-11-23T03:12:18.3364003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3364246Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3364563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3364711Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3365076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3365186Z output = model(*input) 2022-11-23T03:12:18.3365499Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3365633Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3366149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3366303Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3366818Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3366983Z _lazy_init(state, module) 2022-11-23T03:12:18.3367326Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3367457Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3367784Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3367896Z return func(*args, **kwargs) 2022-11-23T03:12:18.3368269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3368354Z p_assert( 2022-11-23T03:12:18.3368725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3368845Z traceback.print_stack() 2022-11-23T03:12:18.3369092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.3369329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.3369608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.3369765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.3370154Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.3370267Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3370468Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3370600Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3370795Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3370946Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3371152Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3371250Z self.run() 2022-11-23T03:12:18.3371433Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3371569Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3371899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3372020Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3372425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3372486Z getattr(self, test_name)() 2022-11-23T03:12:18.3372833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3372921Z fn() 2022-11-23T03:12:18.3373262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3373441Z test(self, **param_kwargs) 2022-11-23T03:12:18.3373798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3374008Z return func(*args, **kwargs) 2022-11-23T03:12:18.3374311Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3374413Z self.run_subtests( 2022-11-23T03:12:18.3374739Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3374885Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3375213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3375351Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3375942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3376060Z output = model(*input) 2022-11-23T03:12:18.3376375Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3376507Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3376873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3377039Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3377383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3377495Z _lazy_init(state, module) 2022-11-23T03:12:18.3377835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3378024Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3378304Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3378422Z return func(*args, **kwargs) 2022-11-23T03:12:18.3378792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3378884Z p_assert( 2022-11-23T03:12:18.3379199Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3379316Z traceback.print_stack() 2022-11-23T03:12:18.3379703Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.3380244Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.3380665Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.3380908Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3381112Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3381245Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3381448Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3381580Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3381857Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3381884Z self.run() 2022-11-23T03:12:18.3382077Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3382214Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3382543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3382668Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3383072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3383184Z getattr(self, test_name)() 2022-11-23T03:12:18.3383536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3383625Z fn() 2022-11-23T03:12:18.3384164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3384283Z test(self, **param_kwargs) 2022-11-23T03:12:18.3384632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3384905Z return func(*args, **kwargs) 2022-11-23T03:12:18.3385124Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3385222Z self.run_subtests( 2022-11-23T03:12:18.3385636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3385784Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3386124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3386263Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3386612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3386720Z output = model(*input) 2022-11-23T03:12:18.3387017Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3387146Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3387499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3387667Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3388013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3388124Z _lazy_init(state, module) 2022-11-23T03:12:18.3388452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3388578Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3388888Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3388999Z return func(*args, **kwargs) 2022-11-23T03:12:18.3389356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3389449Z p_assert( 2022-11-23T03:12:18.3389763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3389879Z traceback.print_stack() 2022-11-23T03:12:18.3390175Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3390375Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3390499Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3390692Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3390834Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3391035Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3391129Z self.run() 2022-11-23T03:12:18.3391321Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3391456Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3391779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3391911Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3392324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3392446Z getattr(self, test_name)() 2022-11-23T03:12:18.3392796Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3392888Z fn() 2022-11-23T03:12:18.3393237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3393350Z test(self, **param_kwargs) 2022-11-23T03:12:18.3393685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3393801Z return func(*args, **kwargs) 2022-11-23T03:12:18.3394038Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3394198Z self.run_subtests( 2022-11-23T03:12:18.3394545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3394704Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3395052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3395231Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3395546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3395760Z output = model(*input) 2022-11-23T03:12:18.3395971Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3396110Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3396473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3396644Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3397001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3397111Z _lazy_init(state, module) 2022-11-23T03:12:18.3397440Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3397576Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3397903Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3398016Z return func(*args, **kwargs) 2022-11-23T03:12:18.3398389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3398481Z p_assert( 2022-11-23T03:12:18.3398807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3399016Z traceback.print_stack() 2022-11-23T03:12:18.3399047Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3399245Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3399469Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3399569Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3399711Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3399913Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3400010Z self.run() 2022-11-23T03:12:18.3400207Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3400337Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3400673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3400799Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3401201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3401323Z getattr(self, test_name)() 2022-11-23T03:12:18.3401672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3401758Z fn() 2022-11-23T03:12:18.3402110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3402215Z test(self, **param_kwargs) 2022-11-23T03:12:18.3402568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3402674Z return func(*args, **kwargs) 2022-11-23T03:12:18.3402910Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3403213Z self.run_subtests( 2022-11-23T03:12:18.3403549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3403699Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3404036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3404165Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3404515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3404620Z output = model(*input) 2022-11-23T03:12:18.3404921Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3405049Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3405406Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3405580Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3405924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3406024Z _lazy_init(state, module) 2022-11-23T03:12:18.3406352Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3406486Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3406800Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3406910Z return func(*args, **kwargs) 2022-11-23T03:12:18.3407261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3407351Z p_assert( 2022-11-23T03:12:18.3407663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3407774Z traceback.print_stack() 2022-11-23T03:12:18.3408001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.3408221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.3408433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.3408654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.3409028Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 
2022-11-23T03:12:18.3409143Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3409336Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3409621Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3409864Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3410012Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3410227Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3410315Z self.run() 2022-11-23T03:12:18.3410510Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3410645Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3410967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3411096Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3411443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3411556Z getattr(self, test_name)() 2022-11-23T03:12:18.3411903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3412039Z fn() 2022-11-23T03:12:18.3412398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3412511Z test(self, **param_kwargs) 2022-11-23T03:12:18.3412848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3412963Z return func(*args, **kwargs) 2022-11-23T03:12:18.3413197Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3413302Z self.run_subtests( 2022-11-23T03:12:18.3413642Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3413791Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3414140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3414290Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3414646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3414763Z output = model(*input) 2022-11-23T03:12:18.3415079Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3415228Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3415727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3415885Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3416228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3416334Z _lazy_init(state, module) 2022-11-23T03:12:18.3416677Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3416798Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3417109Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3417217Z return func(*args, **kwargs) 2022-11-23T03:12:18.3417575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3417663Z p_assert( 2022-11-23T03:12:18.3417976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3418258Z traceback.print_stack() 2022-11-23T03:12:18.3418636Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.3419031Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.3419473Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.3419602Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3419800Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3420029Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3420127Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3420267Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3420473Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3420559Z self.run() 2022-11-23T03:12:18.3420754Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3420892Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3421430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3421550Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3421887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3421997Z getattr(self, test_name)() 2022-11-23T03:12:18.3422326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3422416Z fn() 2022-11-23T03:12:18.3422757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3422870Z test(self, **param_kwargs) 2022-11-23T03:12:18.3423208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3423317Z return func(*args, **kwargs) 2022-11-23T03:12:18.3423552Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3423653Z self.run_subtests( 2022-11-23T03:12:18.3424353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3424513Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3424870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3425013Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3425377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3425490Z output = model(*input) 2022-11-23T03:12:18.3425807Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3425937Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3426300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3426468Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3426823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3426934Z _lazy_init(state, module) 2022-11-23T03:12:18.3427420Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3427551Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3427868Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3427978Z return func(*args, **kwargs) 2022-11-23T03:12:18.3428506Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3428606Z p_assert( 2022-11-23T03:12:18.3429096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3429133Z traceback.print_stack() 2022-11-23T03:12:18.3429253Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3429452Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3429589Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3429782Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3429915Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3430115Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3430208Z self.run() 2022-11-23T03:12:18.3430404Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3430541Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3430949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3431071Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3431414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3431527Z getattr(self, test_name)() 2022-11-23T03:12:18.3431876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3431965Z fn() 2022-11-23T03:12:18.3432319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3432431Z test(self, **param_kwargs) 2022-11-23T03:12:18.3432772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3432886Z return func(*args, **kwargs) 2022-11-23T03:12:18.3433122Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3433233Z self.run_subtests( 2022-11-23T03:12:18.3433729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3433876Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3434215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3434354Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3434701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3434805Z output = model(*input) 2022-11-23T03:12:18.3435100Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3435232Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3435591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3435751Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3436095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3436280Z _lazy_init(state, module) 2022-11-23T03:12:18.3436529Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3436656Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3436969Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3437071Z return func(*args, **kwargs) 2022-11-23T03:12:18.3437424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3437530Z p_assert( 2022-11-23T03:12:18.3437916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3438049Z traceback.print_stack() 2022-11-23T03:12:18.3438347Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3438560Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3438683Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3438886Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3439043Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3439257Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3439363Z self.run() 2022-11-23T03:12:18.3439568Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3439717Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3440119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3440235Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3440599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3447552Z getattr(self, test_name)() 2022-11-23T03:12:18.3448168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3448249Z fn() 2022-11-23T03:12:18.3448606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3448718Z test(self, **param_kwargs) 2022-11-23T03:12:18.3449056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3449166Z return func(*args, **kwargs) 2022-11-23T03:12:18.3449408Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3449507Z self.run_subtests( 2022-11-23T03:12:18.3449835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3449974Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3450314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3450453Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3450804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3450911Z output = model(*input) 2022-11-23T03:12:18.3451213Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3451342Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3451701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3451854Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3452197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3452316Z _lazy_init(state, module) 2022-11-23T03:12:18.3452690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3452768Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3453081Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3453190Z return func(*args, **kwargs) 2022-11-23T03:12:18.3453642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3453703Z p_assert( 2022-11-23T03:12:18.3454040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3454164Z traceback.print_stack() 2022-11-23T03:12:18.3454393Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.3454611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.3454826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.3455133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.3455418Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.3455527Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3455780Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3456079Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3456271Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3456413Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3456612Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3456704Z self.run() 2022-11-23T03:12:18.3456893Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3457022Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3457457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3457524Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3457828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3457949Z getattr(self, test_name)() 2022-11-23T03:12:18.3458298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3458386Z fn() 2022-11-23T03:12:18.3458742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3458846Z test(self, **param_kwargs) 2022-11-23T03:12:18.3459191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3459303Z return func(*args, **kwargs) 2022-11-23T03:12:18.3459541Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3459643Z self.run_subtests( 2022-11-23T03:12:18.3460018Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3460140Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3460645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3460778Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3461127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3461231Z output = model(*input) 2022-11-23T03:12:18.3461638Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3461733Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3462016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3462175Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3462574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3462678Z _lazy_init(state, module) 2022-11-23T03:12:18.3463012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3463157Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3463668Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3463763Z return func(*args, **kwargs) 2022-11-23T03:12:18.3464431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3464575Z p_assert( 2022-11-23T03:12:18.3464858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3464967Z traceback.print_stack() 2022-11-23T03:12:18.3465351Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.3465853Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.3466240Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.3466361Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3466556Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3466687Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3466880Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3467011Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3467213Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3467304Z self.run() 2022-11-23T03:12:18.3467656Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3467791Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3468108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3468225Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3468835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3469064Z getattr(self, test_name)() 2022-11-23T03:12:18.3469316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3469402Z fn() 2022-11-23T03:12:18.3469775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3469871Z test(self, **param_kwargs) 2022-11-23T03:12:18.3470214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3470335Z return func(*args, **kwargs) 2022-11-23T03:12:18.3470569Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3470664Z self.run_subtests( 2022-11-23T03:12:18.3471002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3471151Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3471500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3471641Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3472001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3472110Z output = model(*input) 2022-11-23T03:12:18.3472499Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3472632Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3473002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3473166Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3473519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3473632Z _lazy_init(state, module) 2022-11-23T03:12:18.3473971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3474104Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3474586Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3474741Z return func(*args, **kwargs) 2022-11-23T03:12:18.3475102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3475191Z p_assert( 2022-11-23T03:12:18.3475505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3475614Z traceback.print_stack() 2022-11-23T03:12:18.3475727Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3475917Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3476300Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3476401Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3476540Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3476850Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3476850Z self.run() 2022-11-23T03:12:18.3477111Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3477186Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3477521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3477635Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3477987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3478099Z getattr(self, test_name)() 2022-11-23T03:12:18.3478444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3478531Z fn() 2022-11-23T03:12:18.3478882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.3478992Z test(self, **param_kwargs) 2022-11-23T03:12:18.3479341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3479447Z return func(*args, **kwargs) 2022-11-23T03:12:18.3479681Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3479783Z self.run_subtests( 2022-11-23T03:12:18.3480119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3480270Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3480615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3480754Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3480870Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.3481226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3481341Z output = model(*input) 2022-11-23T03:12:18.3481743Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3481885Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3482078Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3482207Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3482570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3482734Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3482918Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3483056Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3483451Z File
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3483568Z _lazy_init(state, module) 2022-11-23T03:12:18.3483775Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3483867Z self.run() 2022-11-23T03:12:18.3484212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3484337Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3484528Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3484663Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3485140Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3485248Z return func(*args, **kwargs) 2022-11-23T03:12:18.3485741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3485865Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3486241Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3486327Z p_assert( 2022-11-23T03:12:18.3486675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3486787Z getattr(self, test_name)() 2022-11-23T03:12:18.3487108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3487223Z traceback.print_stack() 2022-11-23T03:12:18.3487669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3487756Z fn() 2022-11-23T03:12:18.3488123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.3488296Z test(self, **param_kwargs) 2022-11-23T03:12:18.3488643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3488690Z return func(*args, **kwargs) 2022-11-23T03:12:18.3489078Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3489180Z self.run_subtests( 2022-11-23T03:12:18.3489678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3489830Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3490176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3490311Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3490672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3490786Z output = model(*input) 2022-11-23T03:12:18.3491146Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3491282Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3491646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3491814Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3492170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3492273Z _lazy_init(state, module) 2022-11-23T03:12:18.3492611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3492742Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3493069Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3493232Z return func(*args, **kwargs) 2022-11-23T03:12:18.3493606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3493699Z p_assert( 2022-11-23T03:12:18.3494020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3494128Z traceback.print_stack() 2022-11-23T03:12:18.3494364Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.3494590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.3494809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.3495026Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.3495415Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.3495813Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.3496197Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 
2022-11-23T03:12:18.3496316Z File "", line 1, in 2022-11-23T03:12:18.3496510Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3496800Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3497159Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3497304Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3497506Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3497599Z self.run() 2022-11-23T03:12:18.3497795Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3497926Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3498256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3498378Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3498728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3498840Z getattr(self, test_name)() 2022-11-23T03:12:18.3499185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3499270Z fn() 2022-11-23T03:12:18.3499621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3499784Z test(self, **param_kwargs) 2022-11-23T03:12:18.3500070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3500232Z return func(*args, **kwargs) 2022-11-23T03:12:18.3500474Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3500575Z self.run_subtests( 2022-11-23T03:12:18.3500914Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3501063Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3501410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3501543Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3501906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3502012Z output = model(*input) 2022-11-23T03:12:18.3502383Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3502512Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3503025Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3503184Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3503527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3503624Z _lazy_init(state, module) 2022-11-23T03:12:18.3504338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3504477Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3504810Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3504929Z return func(*args, **kwargs) 2022-11-23T03:12:18.3505296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3505386Z p_assert( 2022-11-23T03:12:18.3505709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3505816Z traceback.print_stack() 2022-11-23T03:12:18.3505932Z File "", line 1, in 2022-11-23T03:12:18.3506130Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3506260Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3506450Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3506591Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3506790Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3506886Z self.run() 2022-11-23T03:12:18.3507234Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3507365Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3507683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3507801Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3508136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3508300Z getattr(self, test_name)() 2022-11-23T03:12:18.3508576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3508653Z fn() 2022-11-23T03:12:18.3508994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3509103Z test(self, **param_kwargs) 2022-11-23T03:12:18.3509526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3509646Z return func(*args, **kwargs) 2022-11-23T03:12:18.3509871Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3509969Z self.run_subtests( 2022-11-23T03:12:18.3510299Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3510538Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3510946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3511089Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3511449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3511633Z output = model(*input) 2022-11-23T03:12:18.3511954Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3512085Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3512450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3512616Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3512963Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3513073Z _lazy_init(state, module) 2022-11-23T03:12:18.3513482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3513544Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3513867Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3513987Z return func(*args, **kwargs) 2022-11-23T03:12:18.3514354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3514445Z p_assert( 2022-11-23T03:12:18.3514762Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3514877Z traceback.print_stack() 2022-11-23T03:12:18.3514996Z File "", line 1, in 2022-11-23T03:12:18.3515194Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3515372Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3515614Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3515815Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3516000Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3516093Z self.run() 2022-11-23T03:12:18.3516280Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3516410Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3516726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3516843Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3517182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3517290Z getattr(self, test_name)() 2022-11-23T03:12:18.3517617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3517700Z fn() 2022-11-23T03:12:18.3518039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3518146Z test(self, **param_kwargs) 2022-11-23T03:12:18.3518708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3518828Z return func(*args, **kwargs) 2022-11-23T03:12:18.3519060Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3519162Z self.run_subtests( 2022-11-23T03:12:18.3519496Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3519647Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3519997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3520139Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3520501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3520656Z output = model(*input) 2022-11-23T03:12:18.3520982Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3521112Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3521631Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3521788Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3522129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3522238Z _lazy_init(state, module) 2022-11-23T03:12:18.3522564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3522694Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3523006Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3523121Z return func(*args, **kwargs) 2022-11-23T03:12:18.3523471Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3523560Z p_assert( 2022-11-23T03:12:18.3523873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3523984Z traceback.print_stack() 2022-11-23T03:12:18.3524530Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.3524650Z File "", line 1, in 2022-11-23T03:12:18.3524854Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3524986Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3525169Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3525313Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3525518Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3525612Z self.run() 2022-11-23T03:12:18.3525801Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3525935Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3526260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3526377Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3526727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3526842Z getattr(self, test_name)() 2022-11-23T03:12:18.3527189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3527281Z fn() 2022-11-23T03:12:18.3527787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3527953Z test(self, **param_kwargs) 2022-11-23T03:12:18.3528299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3528400Z return func(*args, **kwargs) 2022-11-23T03:12:18.3528629Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3528726Z self.run_subtests( 2022-11-23T03:12:18.3529055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3529200Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3529534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3529671Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3530081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3530178Z output = model(*input) 2022-11-23T03:12:18.3530722Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3530789Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3531154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3531319Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3531668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3531776Z _lazy_init(state, module) 2022-11-23T03:12:18.3532112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3532241Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3532568Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3532683Z return func(*args, **kwargs) 2022-11-23T03:12:18.3533047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T03:12:18.3533137Z p_assert( 2022-11-23T03:12:18.3533459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3533573Z traceback.print_stack() 2022-11-23T03:12:18.3533812Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.3534032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.3534251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.3534474Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.3535013Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.3535388Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.3535758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.3536129Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.3536344Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.3536554Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.3536807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.3537184Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 
2022-11-23T03:12:18.3537551Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.3537771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.3538134Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.3538671Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.3538892Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.3539158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.3539377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.3539757Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.3540129Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.3540356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.3540732Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.3541108Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.3541330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.3541769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.3541978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.3542342Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.3542564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.3542922Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.3543283Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.3543931Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.3544323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.3544483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.3544795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.3545073Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.3545447Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.3545675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.3546047Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.3546492Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.3547251Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3547983Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3548715Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3549655Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3550355Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3551054Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3551754Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3552449Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3553146Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3553837Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.3554063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.3554279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.3554667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.3555054Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.3555332Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.3555727Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.3556108Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.3556483Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.3556709Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.3556929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.3557146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.3557732Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.3557953Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.3558319Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 
2022-11-23T03:12:18.3558683Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.3559049Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.3559256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.3559514Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.3559683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.3560051Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.3560509Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.3560739Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.3561107Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.3561371Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.3561587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.3561797Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.3562006Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.3562374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.3562594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.3562957Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.3563319Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.3563857Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.3564080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.3564367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.3564594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.3564966Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.3565192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.3565567Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.3565940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.3566315Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 
2022-11-23T03:12:18.3566745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.3566957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.3567166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.3567529Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.3567741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.3568103Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.3568467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.3568845Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.3569270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.3569491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.3569708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.3570086Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.3570310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.3570687Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 
2022-11-23T03:12:18.3571056Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.3571433Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.3571655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 2 2022-11-23T03:12:18.3571875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T03:12:18.3572092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T03:12:18.3572468Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.3572694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 3 2022-11-23T03:12:18.3573066Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.3573505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.3573882Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.3574103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 3 2022-11-23T03:12:18.3574319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T03:12:18.3574570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T03:12:18.3575070Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 
2022-11-23T03:12:18.3575289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 2 2022-11-23T03:12:18.3575704Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.3576065Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.3576598Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.3576818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 2 2022-11-23T03:12:18.3577030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T03:12:18.3577247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T03:12:18.3577626Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.3577859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 3 2022-11-23T03:12:18.3578308Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.3578645Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.3578987Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 
2022-11-23T03:12:18.3579211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T03:12:18.3579429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 3 2022-11-23T03:12:18.3579639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T03:12:18.3580023Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.3580251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 2 2022-11-23T03:12:18.3580778Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.3581137Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.3581670Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.3581951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T03:12:18.3582119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 3 2022-11-23T03:12:18.3582337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T03:12:18.3582757Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.3582989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 2 2022-11-23T03:12:18.3583368Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 
2022-11-23T03:12:18.3583836Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.3584311Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.3585120Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3585862Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3586591Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3587524Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3588282Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.3588979Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3589678Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3590550Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3591273Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.3591512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 3 2022-11-23T03:12:18.3591736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T03:12:18.3592026Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T03:12:18.3592521Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 
2022-11-23T03:12:18.3592646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 2 2022-11-23T03:12:18.3593025Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.3593407Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.3593786Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.3594337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 3 2022-11-23T03:12:18.3594614Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T03:12:18.3594836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T03:12:18.3595216Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.3595444Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 2 2022-11-23T03:12:18.3595815Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.3596193Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.3596567Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 
2022-11-23T03:12:18.3596798Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 3 2022-11-23T03:12:18.3597170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T03:12:18.3597600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T03:12:18.3597936Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.3598161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 2 2022-11-23T03:12:18.3598537Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.3598914Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.3599291Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.3599511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 3 2022-11-23T03:12:18.3599772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T03:12:18.3599948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T03:12:18.3600324Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.3600697Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 
2022-11-23T03:12:18.3600925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 2 2022-11-23T03:12:18.3601302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.3601724Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.3602030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T03:12:18.3602165Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 3 2022-11-23T03:12:18.3602381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T03:12:18.3602760Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.3602985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 2 2022-11-23T03:12:18.3603361Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.3603798Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.3604170Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 
2022-11-23T03:12:18.3604545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T03:12:18.3604749Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 3 2022-11-23T03:12:18.3604958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T03:12:18.3605320Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.3605536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 2 2022-11-23T03:12:18.3605906Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.3606270Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.3606632Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.3606846Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T03:12:18.3607058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 3 2022-11-23T03:12:18.3607267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T03:12:18.3607622Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.3607842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 2 2022-11-23T03:12:18.3608307Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 
2022-11-23T03:12:18.3608566Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.3608924Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.3609023Z dist init r=1, world=4 2022-11-23T03:12:18.3609331Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3609625Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3609959Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3610252Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3610525Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3610871Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3611252Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3611546Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.3611881Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3612168Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3612454Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3612747Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.3612846Z dist init r=0, world=4 2022-11-23T03:12:18.3613157Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3613466Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3613755Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3614049Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3614343Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3614635Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.3614930Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3615219Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3615509Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3615798Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3616242Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3616567Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.3616669Z dist init r=3, world=4 2022-11-23T03:12:18.3616962Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3617254Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3617540Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3618001Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.3618382Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3618672Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3618958Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3619248Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3619536Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3619824Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3620120Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3620469Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.3620566Z dist init r=2, world=4 2022-11-23T03:12:18.3620878Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3621177Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.3621472Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3621775Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3622220Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3622499Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3622778Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3623054Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3623383Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3623674Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3624308Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.3624594Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.3624686Z ok (7.927s) 2022-11-23T03:12:18.3625024Z test_mixture_of_experts_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16072 2022-11-23T03:12:18.3625312Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16073 2022-11-23T03:12:18.3625523Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16074 2022-11-23T03:12:18.3625728Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16075 2022-11-23T03:12:18.3626106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.3626271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.3626636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.3626816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.3627176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.3627339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.3627711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.3628042Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.3628385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.3628541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.3628893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.3629057Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.3629399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.3629556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.3629913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.3630083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.3630309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.3630532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.3631073Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.3631213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.3631594Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.3631976Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.3632426Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.3632812Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.3633031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.3633244Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.3633457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.3633668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.3634805Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.3634956Z warnings.warn( 2022-11-23T03:12:18.3635174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.3636127Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.3636223Z warnings.warn( 2022-11-23T03:12:18.3637192Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.3637324Z warnings.warn( 2022-11-23T03:12:18.3638249Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.3638348Z warnings.warn( 2022-11-23T03:12:18.3638575Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.3638796Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.3639190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.3639573Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.3639953Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.3640075Z File "", line 1, in 2022-11-23T03:12:18.3640270Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3640404Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3640597Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3640785Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3640995Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3641088Z self.run() 2022-11-23T03:12:18.3641279Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3641461Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3641799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3642083Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3642420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3642611Z getattr(self, test_name)() 2022-11-23T03:12:18.3642865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3642998Z fn() 2022-11-23T03:12:18.3643345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3643448Z test(self, **param_kwargs) 2022-11-23T03:12:18.3643780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3644066Z return func(*args, **kwargs) 2022-11-23T03:12:18.3644312Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3644404Z self.run_subtests( 2022-11-23T03:12:18.3644798Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3644896Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3645245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3645384Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3645750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3645861Z output = model(*input) 2022-11-23T03:12:18.3646176Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3646306Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3646673Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3646837Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3647189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3647292Z _lazy_init(state, module) 2022-11-23T03:12:18.3647630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3647768Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3648248Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3648358Z return func(*args, **kwargs) 2022-11-23T03:12:18.3648713Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3648802Z p_assert( 2022-11-23T03:12:18.3649113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3649218Z traceback.print_stack() 2022-11-23T03:12:18.3649588Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.3649957Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.3650076Z File "", line 1, in 2022-11-23T03:12:18.3650315Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3650448Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3650632Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3650765Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3650952Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3651040Z self.run() 2022-11-23T03:12:18.3651228Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3651360Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3651680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3651796Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3652191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3652301Z getattr(self, test_name)() 2022-11-23T03:12:18.3652628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3652710Z fn() 2022-11-23T03:12:18.3653049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3653159Z test(self, **param_kwargs) 2022-11-23T03:12:18.3653492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 166, in wrapper 2022-11-23T03:12:18.3653602Z return func(*args, **kwargs) 2022-11-23T03:12:18.3653828Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3653919Z self.run_subtests( 2022-11-23T03:12:18.3654248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3654400Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3654738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3654875Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3655223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3655328Z output = model(*input) 2022-11-23T03:12:18.3655630Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3655757Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3656101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3656260Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3656609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3656715Z _lazy_init(state, module) 2022-11-23T03:12:18.3657040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3657167Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3657481Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3657591Z return func(*args, **kwargs) 2022-11-23T03:12:18.3657937Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3658028Z p_assert( 2022-11-23T03:12:18.3658341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3658459Z traceback.print_stack() 2022-11-23T03:12:18.3658575Z File "", line 1, in 2022-11-23T03:12:18.3658736Z File "", line 1, in 2022-11-23T03:12:18.3658942Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3659062Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3659247Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3659381Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3659571Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3659695Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3659891Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3659979Z self.run() 2022-11-23T03:12:18.3660162Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3660338Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3660524Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3660655Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3660849Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3660939Z self.run() 2022-11-23T03:12:18.3661257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3661540Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3661722Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3661854Z self._target(*self._args, 
**self._kwargs) 2022-11-23T03:12:18.3662204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3662316Z getattr(self, test_name)() 2022-11-23T03:12:18.3662642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3662771Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3663151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3663206Z fn() 2022-11-23T03:12:18.3663545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3663658Z getattr(self, test_name)() 2022-11-23T03:12:18.3664221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3664403Z test(self, **param_kwargs) 2022-11-23T03:12:18.3664693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3664778Z fn() 2022-11-23T03:12:18.3665122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3665277Z return func(*args, **kwargs) 2022-11-23T03:12:18.3665589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3665701Z test(self, **param_kwargs) 2022-11-23T03:12:18.3665935Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3666035Z self.run_subtests( 2022-11-23T03:12:18.3666379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3666492Z return func(*args, **kwargs) 2022-11-23T03:12:18.3666835Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3666986Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3667212Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3667390Z self.run_subtests( 2022-11-23T03:12:18.3667755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3667898Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3668235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3668384Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3668744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3668852Z output = model(*input) 2022-11-23T03:12:18.3669253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3669395Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3669784Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3670002Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3670279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3670388Z output = model(*input) 2022-11-23T03:12:18.3670748Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3670914Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3671221Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3671350Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3671703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3671819Z _lazy_init(state, module) 2022-11-23T03:12:18.3672184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3672348Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3672688Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3672821Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3673164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3673273Z _lazy_init(state, module) 2022-11-23T03:12:18.3673617Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3673719Z return func(*args, **kwargs) 2022-11-23T03:12:18.3674102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3674194Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3674559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3674651Z p_assert( 2022-11-23T03:12:18.3674969Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3675083Z return func(*args, **kwargs) 2022-11-23T03:12:18.3675565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3675676Z traceback.print_stack() 
2022-11-23T03:12:18.3676029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3676116Z p_assert( 2022-11-23T03:12:18.3676423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3676534Z traceback.print_stack() 2022-11-23T03:12:18.3677002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.3677209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.3677437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.3677664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.3678053Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.3678436Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 
2022-11-23T03:12:18.3678642Z File "", line 1, in 2022-11-23T03:12:18.3678756Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3678958Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3679129Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3679269Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3679469Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3679561Z self.run() 2022-11-23T03:12:18.3679751Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3679884Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3680217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3680331Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3680679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3680951Z getattr(self, test_name)() 2022-11-23T03:12:18.3681294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3681379Z fn() 2022-11-23T03:12:18.3681718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3681824Z test(self, **param_kwargs) 2022-11-23T03:12:18.3682346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3682453Z return func(*args, **kwargs) 2022-11-23T03:12:18.3682689Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3682791Z self.run_subtests( 2022-11-23T03:12:18.3683130Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3683279Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3683634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3683779Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3684225Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3684244Z output = model(*input) 2022-11-23T03:12:18.3684560Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3684691Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3685056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3685219Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3685724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3685834Z _lazy_init(state, module) 2022-11-23T03:12:18.3686205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3686332Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3686647Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3686755Z return func(*args, **kwargs) 2022-11-23T03:12:18.3687108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3687196Z p_assert( 2022-11-23T03:12:18.3687507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3687617Z traceback.print_stack() 2022-11-23T03:12:18.3687729Z File "", line 1, in 2022-11-23T03:12:18.3687912Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3688092Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3688278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3688411Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3688606Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3688695Z self.run() 2022-11-23T03:12:18.3688879Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3689002Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3689326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3689448Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3689783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3689897Z getattr(self, test_name)() 2022-11-23T03:12:18.3690232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3690315Z fn() 2022-11-23T03:12:18.3690653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3690929Z test(self, **param_kwargs) 2022-11-23T03:12:18.3691274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3691387Z return func(*args, **kwargs) 2022-11-23T03:12:18.3691623Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3691726Z self.run_subtests( 2022-11-23T03:12:18.3692065Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3692215Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3692571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3692706Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3693070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3693178Z output = model(*input) 2022-11-23T03:12:18.3693492Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3693621Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3693983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3694146Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3694499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3694669Z _lazy_init(state, module) 2022-11-23T03:12:18.3695022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3695152Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3695475Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3695610Z return func(*args, **kwargs) 2022-11-23T03:12:18.3695957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3696048Z p_assert( 2022-11-23T03:12:18.3696374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3696480Z traceback.print_stack() 2022-11-23T03:12:18.3696874Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.3697307Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.3697579Z File "", line 1, in 2022-11-23T03:12:18.3697946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3698078Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3698268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3698410Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3698602Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3698695Z self.run() 2022-11-23T03:12:18.3698884Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3699018Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3699347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3699474Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3699826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3699937Z getattr(self, test_name)() 2022-11-23T03:12:18.3700278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3700364Z fn() 2022-11-23T03:12:18.3700716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3700829Z test(self, **param_kwargs) 2022-11-23T03:12:18.3701173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 166, in wrapper 2022-11-23T03:12:18.3701286Z return func(*args, **kwargs) 2022-11-23T03:12:18.3701521Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3701630Z self.run_subtests( 2022-11-23T03:12:18.3701962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3702114Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3702460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3702600Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3702961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3703070Z output = model(*input) 2022-11-23T03:12:18.3703617Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3703662Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3704318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3704498Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3704850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3704956Z _lazy_init(state, module) 2022-11-23T03:12:18.3705282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3705409Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3705723Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3705831Z return func(*args, **kwargs) 2022-11-23T03:12:18.3706178Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3706372Z p_assert( 2022-11-23T03:12:18.3706693Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3706803Z traceback.print_stack() 2022-11-23T03:12:18.3706915Z File "", line 1, in 2022-11-23T03:12:18.3707107Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3707234Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3707412Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3707549Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3707744Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3707833Z self.run() 2022-11-23T03:12:18.3708017Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3708147Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3708471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3708591Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3708924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3709033Z getattr(self, test_name)() 2022-11-23T03:12:18.3709367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3709449Z fn() 2022-11-23T03:12:18.3709953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3710065Z test(self, **param_kwargs) 2022-11-23T03:12:18.3710408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.3710522Z return func(*args, **kwargs) 2022-11-23T03:12:18.3710747Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3710857Z self.run_subtests( 2022-11-23T03:12:18.3711221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3711349Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3711698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3711838Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3712197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3712304Z output = model(*input) 2022-11-23T03:12:18.3712611Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3712742Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3713158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3713333Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3713689Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3713799Z _lazy_init(state, module) 2022-11-23T03:12:18.3714136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3714267Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3714584Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3714698Z return func(*args, **kwargs) 2022-11-23T03:12:18.3715060Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3715199Z p_assert( 2022-11-23T03:12:18.3715526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3715640Z traceback.print_stack() 2022-11-23T03:12:18.3715873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.3716132Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.3716490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.3716713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.3717086Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.3717459Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.3717833Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.3717947Z File "", line 1, in 2022-11-23T03:12:18.3718141Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3718268Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3718453Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3718582Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3718777Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3718867Z self.run() 2022-11-23T03:12:18.3719051Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3719354Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3719684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3719812Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3720156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3720268Z getattr(self, test_name)() 2022-11-23T03:12:18.3720613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3720700Z fn() 2022-11-23T03:12:18.3721051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3721203Z test(self, **param_kwargs) 2022-11-23T03:12:18.3721507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3721621Z return func(*args, **kwargs) 2022-11-23T03:12:18.3721848Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3722005Z self.run_subtests( 2022-11-23T03:12:18.3722513Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3722659Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3722991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3723126Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3723241Z File "", line 1, in 2022-11-23T03:12:18.3723590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3723687Z output = model(*input) 2022-11-23T03:12:18.3723991Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3724114Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3724358Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3724483Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3724834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3724992Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3725354Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3725487Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3725844Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3725952Z _lazy_init(state, module) 2022-11-23T03:12:18.3726151Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3726243Z self.run() 2022-11-23T03:12:18.3726589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3726721Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.3726910Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3727037Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3727364Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3727477Z return func(*args, **kwargs) 2022-11-23T03:12:18.3727804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3727925Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3728441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3728704Z p_assert( 2022-11-23T03:12:18.3729062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3729169Z getattr(self, test_name)() 2022-11-23T03:12:18.3729492Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3729618Z traceback.print_stack() 2022-11-23T03:12:18.3729968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3730053Z fn() 2022-11-23T03:12:18.3730402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3730515Z test(self, **param_kwargs) 2022-11-23T03:12:18.3730850Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3730963Z return func(*args, **kwargs) 2022-11-23T03:12:18.3731197Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3731395Z self.run_subtests( 2022-11-23T03:12:18.3731478Z File "", line 1, 
in 2022-11-23T03:12:18.3731820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3731971Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3732323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3732457Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3732653Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3732782Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3733144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3733312Z output = model(*input) 2022-11-23T03:12:18.3733506Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3733646Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3733963Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3734085Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3734433Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3734529Z self.run() 2022-11-23T03:12:18.3734879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3735032Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3735219Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3735345Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3735693Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3735800Z _lazy_init(state, module) 
2022-11-23T03:12:18.3736115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3736230Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3736649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3736682Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3737015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3737120Z getattr(self, test_name)() 2022-11-23T03:12:18.3737433Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3737543Z return func(*args, **kwargs) 2022-11-23T03:12:18.3737881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3737963Z fn() 2022-11-23T03:12:18.3738308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3738396Z p_assert( 2022-11-23T03:12:18.3738734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3738845Z test(self, **param_kwargs) 2022-11-23T03:12:18.3739155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3739433Z traceback.print_stack() 2022-11-23T03:12:18.3739779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3739892Z return func(*args, **kwargs) 2022-11-23T03:12:18.3740120Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3740281Z self.run_subtests( 2022-11-23T03:12:18.3740638Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3740787Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3741138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3741280Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3741700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3741809Z output = model(*input) 2022-11-23T03:12:18.3742113Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3742335Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3742664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3742828Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3743185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3743295Z _lazy_init(state, module) 2022-11-23T03:12:18.3743631Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3743763Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3744268Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3744386Z return func(*args, **kwargs) 2022-11-23T03:12:18.3744756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3744851Z p_assert( 2022-11-23T03:12:18.3745176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3745290Z traceback.print_stack() 2022-11-23T03:12:18.3745678Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.3745796Z File "", line 1, in 2022-11-23T03:12:18.3745987Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3746116Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3746307Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3746445Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3746644Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3746734Z self.run() 2022-11-23T03:12:18.3746924Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3747065Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3747387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3747588Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3747862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3747971Z getattr(self, test_name)() 2022-11-23T03:12:18.3748319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3748560Z fn() 2022-11-23T03:12:18.3748902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3749009Z test(self, **param_kwargs) 2022-11-23T03:12:18.3749333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3749519Z return func(*args, **kwargs) 2022-11-23T03:12:18.3749756Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3749855Z self.run_subtests( 2022-11-23T03:12:18.3750182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3750326Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3750662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3750799Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3751142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3751246Z output = model(*input) 2022-11-23T03:12:18.3751616Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3751747Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3752105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3752268Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3752605Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3752711Z _lazy_init(state, module) 2022-11-23T03:12:18.3753031Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3753157Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3753471Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3753580Z return func(*args, **kwargs) 2022-11-23T03:12:18.3753938Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T03:12:18.3754027Z p_assert( 2022-11-23T03:12:18.3754341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3754450Z traceback.print_stack() 2022-11-23T03:12:18.3754668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.3754894Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.3755114Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.3755334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.3755705Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.3756080Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.3756443Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 
2022-11-23T03:12:18.3756560Z File "", line 1, in 2022-11-23T03:12:18.3756747Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3756872Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3757058Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3757190Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3757385Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3757474Z self.run() 2022-11-23T03:12:18.3757659Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3757792Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3757995Z File "", line 1, in 2022-11-23T03:12:18.3758328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3758445Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3758778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3758887Z getattr(self, test_name)() 2022-11-23T03:12:18.3759076Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3759202Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3759536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3759613Z fn() 2022-11-23T03:12:18.3759798Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3759982Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3760387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3760435Z test(self, **param_kwargs) 2022-11-23T03:12:18.3760629Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3760718Z self.run() 2022-11-23T03:12:18.3761044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3761155Z return func(*args, **kwargs) 2022-11-23T03:12:18.3761339Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3761471Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3761697Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3761799Z self.run_subtests( 2022-11-23T03:12:18.3762116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3762234Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3762557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3762703Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3763040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3763165Z getattr(self, test_name)() 2022-11-23T03:12:18.3763493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3763620Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3763952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3764040Z fn() 2022-11-23T03:12:18.3764560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3764714Z output = model(*input) 2022-11-23T03:12:18.3765026Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3765138Z test(self, **param_kwargs) 2022-11-23T03:12:18.3765450Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3765602Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3765928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3766041Z return func(*args, **kwargs) 2022-11-23T03:12:18.3766396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3766613Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3766856Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3766958Z self.run_subtests( 2022-11-23T03:12:18.3767312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3767422Z _lazy_init(state, module) 2022-11-23T03:12:18.3767913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3768057Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3768373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3768498Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3768837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3769081Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3769440Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.3769512Z return func(*args, **kwargs) 2022-11-23T03:12:18.3770080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3770151Z output = model(*input) 2022-11-23T03:12:18.3770510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3770602Z p_assert( 2022-11-23T03:12:18.3770914Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3771041Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3771364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3771488Z traceback.print_stack() 2022-11-23T03:12:18.3771851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3772013Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3772355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3772465Z _lazy_init(state, module) 2022-11-23T03:12:18.3772801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3772934Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3773256Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3773368Z return func(*args, **kwargs) 2022-11-23T03:12:18.3773737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3773828Z p_assert( 2022-11-23T03:12:18.3774149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in 
p_assert 2022-11-23T03:12:18.3774266Z traceback.print_stack() 2022-11-23T03:12:18.3774385Z File "", line 1, in 2022-11-23T03:12:18.3774586Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3774717Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3774905Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3775042Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3775337Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3775351Z self.run() 2022-11-23T03:12:18.3775519Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3775865Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3776192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3776333Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3776699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3776754Z getattr(self, test_name)() 2022-11-23T03:12:18.3777083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3777393Z fn() 2022-11-23T03:12:18.3777688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3777798Z test(self, **param_kwargs) 2022-11-23T03:12:18.3778138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3778308Z return func(*args, **kwargs) 2022-11-23T03:12:18.3778542Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3778644Z self.run_subtests( 2022-11-23T03:12:18.3778997Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3779127Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3779479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3779620Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3779979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3780087Z output = model(*input) 2022-11-23T03:12:18.3780398Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3780534Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3780888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3781054Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3781409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3781516Z _lazy_init(state, module) 2022-11-23T03:12:18.3781855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3781985Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3782309Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3782423Z return func(*args, **kwargs) 2022-11-23T03:12:18.3782791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3782882Z p_assert( 2022-11-23T03:12:18.3783204Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3783318Z traceback.print_stack() 2022-11-23T03:12:18.3783705Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.3783931Z File "", line 1, in 2022-11-23T03:12:18.3784221Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3784353Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3784622Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3784675Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3784879Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3785037Z self.run() 2022-11-23T03:12:18.3785238Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3785371Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3785861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3785977Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3786474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3786586Z getattr(self, test_name)() 2022-11-23T03:12:18.3786937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3787022Z fn() 2022-11-23T03:12:18.3787372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3787551Z test(self, **param_kwargs) 2022-11-23T03:12:18.3787900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3788014Z return func(*args, **kwargs) 2022-11-23T03:12:18.3788245Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3788348Z self.run_subtests( 2022-11-23T03:12:18.3788687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3788837Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3789340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3789477Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3789828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3790113Z output = model(*input) 2022-11-23T03:12:18.3790422Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3790552Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3790916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3791083Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3791434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3791544Z _lazy_init(state, module) 2022-11-23T03:12:18.3791881Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3792014Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3792338Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3792452Z return func(*args, **kwargs) 2022-11-23T03:12:18.3792819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T03:12:18.3792913Z p_assert( 2022-11-23T03:12:18.3793235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3793349Z traceback.print_stack() 2022-11-23T03:12:18.3793580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.3793812Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.3794035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.3794263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.3794698Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.3795093Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.3795212Z File "", line 1, in 2022-11-23T03:12:18.3795412Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3795544Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3795737Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3795869Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3796069Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3796162Z self.run() 2022-11-23T03:12:18.3796352Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3796543Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3796873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3796994Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3797344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3797604Z getattr(self, test_name)() 2022-11-23T03:12:18.3797938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3798020Z fn() 2022-11-23T03:12:18.3798537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3798649Z test(self, **param_kwargs) 2022-11-23T03:12:18.3798993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3799110Z return func(*args, **kwargs) 2022-11-23T03:12:18.3799348Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3799444Z self.run_subtests( 2022-11-23T03:12:18.3799781Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3799935Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3800281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3800420Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3800778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3800930Z output = model(*input) 2022-11-23T03:12:18.3801203Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3801334Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3801698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3801862Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3802216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3802327Z _lazy_init(state, module) 2022-11-23T03:12:18.3802677Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3802796Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3803274Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3803377Z return func(*args, **kwargs) 2022-11-23T03:12:18.3803782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3803957Z p_assert( 2022-11-23T03:12:18.3804195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3804305Z traceback.print_stack() 2022-11-23T03:12:18.3804418Z File "", line 1, in 2022-11-23T03:12:18.3804608Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3804728Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3804912Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3805045Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3805236Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3805325Z self.run() 2022-11-23T03:12:18.3805510Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3805691Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3806010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3806120Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3806456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3806563Z getattr(self, test_name)() 2022-11-23T03:12:18.3806895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3806978Z fn() 2022-11-23T03:12:18.3807318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3807425Z test(self, **param_kwargs) 2022-11-23T03:12:18.3807757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3807866Z return func(*args, **kwargs) 2022-11-23T03:12:18.3808091Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3808190Z self.run_subtests( 2022-11-23T03:12:18.3808517Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3808661Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3809077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3809129Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3809479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3809577Z output = model(*input) 2022-11-23T03:12:18.3809879Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3810012Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3810364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3810524Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3810865Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3810970Z _lazy_init(state, module) 2022-11-23T03:12:18.3811297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3811416Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3811909Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3812022Z return func(*args, **kwargs) 2022-11-23T03:12:18.3812444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3812544Z p_assert( 2022-11-23T03:12:18.3812871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3812985Z traceback.print_stack() 2022-11-23T03:12:18.3813372Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.3813749Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.3813868Z File "", line 1, in 2022-11-23T03:12:18.3814067Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3814198Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3814388Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3814586Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3814788Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3814880Z self.run() 2022-11-23T03:12:18.3815065Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3815201Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3815532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3815655Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3816004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3816114Z getattr(self, test_name)() 2022-11-23T03:12:18.3816464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3816648Z fn() 2022-11-23T03:12:18.3817055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3817163Z test(self, **param_kwargs) 2022-11-23T03:12:18.3817493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 166, in wrapper 2022-11-23T03:12:18.3817601Z return func(*args, **kwargs) 2022-11-23T03:12:18.3817826Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3817925Z self.run_subtests( 2022-11-23T03:12:18.3818249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3818394Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3818721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3818860Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3819210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3819315Z output = model(*input) 2022-11-23T03:12:18.3819790Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3819922Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3820282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3820446Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3820794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3820903Z _lazy_init(state, module) 2022-11-23T03:12:18.3821241Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3821438Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3821777Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3821889Z return func(*args, **kwargs) 2022-11-23T03:12:18.3822254Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3822346Z p_assert( 2022-11-23T03:12:18.3822815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3822924Z traceback.print_stack() 2022-11-23T03:12:18.3823037Z File "", line 1, in 2022-11-23T03:12:18.3823227Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3823352Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3823534Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3823720Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3824078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3824177Z self.run() 2022-11-23T03:12:18.3824362Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3824490Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3824813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3825103Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3825458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3825572Z getattr(self, test_name)() 2022-11-23T03:12:18.3825912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3826007Z fn() 2022-11-23T03:12:18.3826364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3826476Z test(self, **param_kwargs) 2022-11-23T03:12:18.3826821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.3826934Z return func(*args, **kwargs) 2022-11-23T03:12:18.3827167Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3827267Z self.run_subtests( 2022-11-23T03:12:18.3827599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3827748Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3828093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3828236Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3828759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3828865Z output = model(*input) 2022-11-23T03:12:18.3829166Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3829289Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3829631Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3829790Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3830131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3830238Z _lazy_init(state, module) 2022-11-23T03:12:18.3830598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3830765Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3831093Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3831201Z return func(*args, **kwargs) 2022-11-23T03:12:18.3831547Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3831865Z p_assert( 2022-11-23T03:12:18.3832190Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3832302Z traceback.print_stack() 2022-11-23T03:12:18.3832536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.3832768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.3833063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.3833294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.3833675Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.3834060Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.3834433Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.3834804Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 
2022-11-23T03:12:18.3834921Z File "", line 1, in 2022-11-23T03:12:18.3835278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3835409Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3835594Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3835729Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3835916Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3836005Z self.run() 2022-11-23T03:12:18.3836189Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3836317Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3836632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3836749Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3837086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3837188Z getattr(self, test_name)() 2022-11-23T03:12:18.3837533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3837618Z fn() 2022-11-23T03:12:18.3837958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3838066Z test(self, **param_kwargs) 2022-11-23T03:12:18.3838177Z File "", line 1, in 2022-11-23T03:12:18.3838508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3838617Z return func(*args, **kwargs) 2022-11-23T03:12:18.3838836Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3838933Z self.run_subtests( 2022-11-23T03:12:18.3839124Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3839250Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3839798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3839960Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3840155Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3840293Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3840641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3840781Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3840984Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3841077Z self.run() 2022-11-23T03:12:18.3841439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3841598Z output = model(*input) 2022-11-23T03:12:18.3841840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3841977Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3842291Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3842420Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3842901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3843018Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3843366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3843525Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3843858Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3843964Z getattr(self, test_name)() 2022-11-23T03:12:18.3844483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3844594Z _lazy_init(state, module) 2022-11-23T03:12:18.3844966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3845030Z fn() 2022-11-23T03:12:18.3845427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3845499Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3845847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3845961Z test(self, **param_kwargs) 2022-11-23T03:12:18.3846276Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3846388Z return func(*args, **kwargs) 2022-11-23T03:12:18.3846742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3846853Z return func(*args, **kwargs) 2022-11-23T03:12:18.3847215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3847308Z p_assert( 2022-11-23T03:12:18.3847540Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3847636Z self.run_subtests( 2022-11-23T03:12:18.3847960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3848073Z traceback.print_stack() 2022-11-23T03:12:18.3848410Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3848560Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3848958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3849328Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3849614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3849711Z output = model(*input) 2022-11-23T03:12:18.3850017Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3850141Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3850489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3850646Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3850985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3851139Z _lazy_init(state, module) 2022-11-23T03:12:18.3851254Z File "", line 1, in 2022-11-23T03:12:18.3851575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3851701Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3852012Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3852122Z return func(*args, **kwargs) 2022-11-23T03:12:18.3852315Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3852441Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3852792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3852880Z p_assert( 2022-11-23T03:12:18.3853057Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3853199Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3853513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3853622Z traceback.print_stack() 2022-11-23T03:12:18.3853817Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3853906Z self.run() 2022-11-23T03:12:18.3854089Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3854217Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3854523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3854639Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3855149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3855260Z getattr(self, test_name)() 2022-11-23T03:12:18.3855614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3855700Z fn() 2022-11-23T03:12:18.3856052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3856161Z test(self, **param_kwargs) 2022-11-23T03:12:18.3856496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3856609Z return func(*args, **kwargs) 2022-11-23T03:12:18.3856844Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3856948Z self.run_subtests( 2022-11-23T03:12:18.3857289Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3857439Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3857994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3858136Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3858481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3858586Z output = model(*input) 2022-11-23T03:12:18.3858889Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3859016Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3859366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3859524Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3859863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3860017Z _lazy_init(state, module) 2022-11-23T03:12:18.3860342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3860472Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3860888Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3860899Z return func(*args, **kwargs) 2022-11-23T03:12:18.3861252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3861339Z p_assert( 2022-11-23T03:12:18.3861654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3861762Z traceback.print_stack() 2022-11-23T03:12:18.3861870Z File "", line 1, in 2022-11-23T03:12:18.3862061Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3862197Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3862381Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3862513Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3862707Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3862796Z self.run() 2022-11-23T03:12:18.3862972Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3863100Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3863416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3863532Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3864038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3864180Z getattr(self, test_name)() 2022-11-23T03:12:18.3864506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3864589Z fn() 2022-11-23T03:12:18.3865102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3865214Z test(self, **param_kwargs) 2022-11-23T03:12:18.3865562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3865675Z return func(*args, **kwargs) 2022-11-23T03:12:18.3865929Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3866011Z self.run_subtests( 2022-11-23T03:12:18.3866347Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3866496Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3866913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3867066Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3867429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3867538Z output = model(*input) 2022-11-23T03:12:18.3868003Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3868127Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3868475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3868633Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3868971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3869174Z _lazy_init(state, module) 2022-11-23T03:12:18.3869512Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3869639Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3869951Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3870235Z return func(*args, **kwargs) 2022-11-23T03:12:18.3870605Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3870699Z p_assert( 2022-11-23T03:12:18.3871011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3871127Z traceback.print_stack() 2022-11-23T03:12:18.3871364Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.3871601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.3871936Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.3872059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.3872445Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.3872827Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.3873200Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.3873314Z File "", line 1, in 2022-11-23T03:12:18.3873514Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3873650Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3873844Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3873984Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3874183Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3874304Z self.run() 2022-11-23T03:12:18.3874459Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3874593Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3874925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3875047Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3875395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3875516Z getattr(self, test_name)() 2022-11-23T03:12:18.3875905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3876157Z fn() 2022-11-23T03:12:18.3876493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3876601Z test(self, **param_kwargs) 2022-11-23T03:12:18.3876934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3877044Z return func(*args, **kwargs) 2022-11-23T03:12:18.3877269Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3877369Z self.run_subtests( 2022-11-23T03:12:18.3877865Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3878016Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3878411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3878552Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3878915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3879021Z output = model(*input) 2022-11-23T03:12:18.3879356Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3879464Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3879826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3879989Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3880333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3880449Z _lazy_init(state, module) 2022-11-23T03:12:18.3880792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3880926Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3881258Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3881369Z return func(*args, **kwargs) 2022-11-23T03:12:18.3881889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3881979Z p_assert( 2022-11-23T03:12:18.3882367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3882478Z traceback.print_stack() 2022-11-23T03:12:18.3882764Z File "", line 1, in 2022-11-23T03:12:18.3882963Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3883099Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3883292Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3883432Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3883631Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3883717Z self.run() 2022-11-23T03:12:18.3883906Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3884041Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3884365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3884486Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3884834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3885011Z getattr(self, test_name)() 2022-11-23T03:12:18.3885437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3885450Z fn() 2022-11-23T03:12:18.3885966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3886074Z test(self, **param_kwargs) 2022-11-23T03:12:18.3886404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3886514Z return func(*args, **kwargs) 2022-11-23T03:12:18.3886740Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3886839Z self.run_subtests( 2022-11-23T03:12:18.3886946Z File "", line 
1, in 2022-11-23T03:12:18.3887273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3887467Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3887810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3887950Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3888229Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3888359Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3888707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3888805Z output = model(*input) 2022-11-23T03:12:18.3888990Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3889123Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3889425Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3889553Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3889750Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3889840Z self.run() 2022-11-23T03:12:18.3890192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3890343Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3890530Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3890659Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3891001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3891106Z _lazy_init(state, module) 
2022-11-23T03:12:18.3891606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3891732Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3892074Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3892201Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3892548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3892660Z getattr(self, test_name)() 2022-11-23T03:12:18.3892988Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3893105Z return func(*args, **kwargs) 2022-11-23T03:12:18.3893445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3893532Z fn() 2022-11-23T03:12:18.3893897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3893985Z p_assert( 2022-11-23T03:12:18.3894385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3894502Z test(self, **param_kwargs) 2022-11-23T03:12:18.3894826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3895097Z traceback.print_stack() 2022-11-23T03:12:18.3895613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3895725Z return func(*args, **kwargs) 2022-11-23T03:12:18.3895952Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3896055Z self.run_subtests( 2022-11-23T03:12:18.3896393Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3896541Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3896987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3897128Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3897575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3897598Z output = model(*input) 2022-11-23T03:12:18.3897913Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3898033Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3898719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3898884Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3899241Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3899355Z _lazy_init(state, module) 2022-11-23T03:12:18.3899695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3899826Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3900149Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3900255Z return func(*args, **kwargs) 2022-11-23T03:12:18.3900620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3900711Z p_assert( 2022-11-23T03:12:18.3901036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.3901150Z traceback.print_stack() 2022-11-23T03:12:18.3901537Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.3901658Z File "", line 1, in 2022-11-23T03:12:18.3901852Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3901983Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3902173Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3902311Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3902512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3902603Z self.run() 2022-11-23T03:12:18.3902793Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3903005Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3903250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3903374Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3903774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3904074Z getattr(self, test_name)() 2022-11-23T03:12:18.3904441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3904527Z fn() 2022-11-23T03:12:18.3905022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3905128Z test(self, **param_kwargs) 2022-11-23T03:12:18.3905452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3905562Z return func(*args, **kwargs) 2022-11-23T03:12:18.3905788Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3905885Z self.run_subtests( 2022-11-23T03:12:18.3906369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3906445Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3906781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3906915Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3907255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3907365Z output = model(*input) 2022-11-23T03:12:18.3907665Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3907792Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3908140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3908303Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3908649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3908757Z _lazy_init(state, module) 2022-11-23T03:12:18.3909088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3909205Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3909516Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3909626Z return func(*args, **kwargs) 2022-11-23T03:12:18.3909980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T03:12:18.3910067Z p_assert( 2022-11-23T03:12:18.3910380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3910492Z traceback.print_stack() 2022-11-23T03:12:18.3910714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.3910940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.3911161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.3911379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.3911752Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.3912038Z File "", line 1, in 2022-11-23T03:12:18.3912238Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3912369Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3912552Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3912770Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3912985Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3913077Z self.run() 2022-11-23T03:12:18.3913268Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3913402Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3913730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3913851Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3914192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3914303Z 
getattr(self, test_name)() 2022-11-23T03:12:18.3914649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3914785Z fn() 2022-11-23T03:12:18.3915143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3915254Z test(self, **param_kwargs) 2022-11-23T03:12:18.3915597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3915711Z return func(*args, **kwargs) 2022-11-23T03:12:18.3915939Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3916042Z self.run_subtests( 2022-11-23T03:12:18.3916382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3916533Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3916880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3917044Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3917546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3917653Z output = model(*input) 2022-11-23T03:12:18.3917949Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3918075Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3918611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3918774Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3919132Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3919241Z _lazy_init(state, module) 2022-11-23T03:12:18.3919579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3919718Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3920039Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3920242Z return func(*args, **kwargs) 2022-11-23T03:12:18.3920655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3920720Z p_assert( 2022-11-23T03:12:18.3920988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3921149Z traceback.print_stack() 2022-11-23T03:12:18.3921627Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.3922377Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.3923105Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 
2022-11-23T03:12:18.3923299Z File "", line 1, in 2022-11-23T03:12:18.3923784Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3924116Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3924422Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3924675Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3925039Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3925195Z self.run() 2022-11-23T03:12:18.3925535Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3925679Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3925884Z File "", line 1, in 2022-11-23T03:12:18.3926568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3926783Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3927423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3927614Z getattr(self, test_name)() 2022-11-23T03:12:18.3927958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3928189Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3928841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3928986Z fn() 2022-11-23T03:12:18.3929324Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3929572Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3930294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3930511Z test(self, **param_kwargs) 2022-11-23T03:12:18.3930669Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3930865Z self.run() 2022-11-23T03:12:18.3931214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3931335Z return func(*args, **kwargs) 2022-11-23T03:12:18.3931525Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3931620Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3931893Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3931974Z self.run_subtests( 2022-11-23T03:12:18.3932318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3932410Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3932779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3932976Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3933618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3933807Z getattr(self, test_name)() 2022-11-23T03:12:18.3934174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3934322Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3934656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3934734Z fn() 2022-11-23T03:12:18.3935010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3935124Z output = model(*input) 2022-11-23T03:12:18.3935632Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3935679Z test(self, **param_kwargs) 2022-11-23T03:12:18.3935997Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3936129Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3936468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3936582Z return func(*args, **kwargs) 2022-11-23T03:12:18.3936945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3937109Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3937345Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3937505Z self.run_subtests( 2022-11-23T03:12:18.3937863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3937973Z _lazy_init(state, module) 2022-11-23T03:12:18.3938303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3938454Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3938944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3939071Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3939408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3939545Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3939866Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.3939976Z return func(*args, **kwargs) 2022-11-23T03:12:18.3940503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3940610Z output = model(*input) 2022-11-23T03:12:18.3940976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3941068Z p_assert( 2022-11-23T03:12:18.3941382Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3941569Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3941897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3942011Z traceback.print_stack() 2022-11-23T03:12:18.3942373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3942537Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3942940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3943096Z _lazy_init(state, module) 2022-11-23T03:12:18.3943488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3943615Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3944496Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3944624Z return func(*args, **kwargs) 2022-11-23T03:12:18.3944743Z File "", line 1, in 2022-11-23T03:12:18.3945092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3945203Z p_assert( 2022-11-23T03:12:18.3945490Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3945613Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3945968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3946078Z traceback.print_stack() 2022-11-23T03:12:18.3946260Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3946377Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3946618Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3946710Z self.run() 2022-11-23T03:12:18.3946818Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3946955Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3947284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3947485Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3947832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3947946Z getattr(self, test_name)() 2022-11-23T03:12:18.3948290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3948377Z fn() 2022-11-23T03:12:18.3948888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3948997Z test(self, **param_kwargs) 2022-11-23T03:12:18.3949332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3949445Z return func(*args, **kwargs) 2022-11-23T03:12:18.3949664Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3949771Z 
self.run_subtests( 2022-11-23T03:12:18.3950100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3950247Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3950585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3950722Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3951069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3951173Z output = model(*input) 2022-11-23T03:12:18.3951468Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3951591Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3951947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3952107Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3952449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3952560Z _lazy_init(state, module) 2022-11-23T03:12:18.3952888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3953015Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3953322Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3953432Z return func(*args, **kwargs) 2022-11-23T03:12:18.3953783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3953874Z p_assert( 2022-11-23T03:12:18.3954250Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3954367Z traceback.print_stack() 2022-11-23T03:12:18.3954594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.3954818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.3955025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.3955235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.3955609Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.3955980Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.3956149Z File "", line 1, in 2022-11-23T03:12:18.3956345Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3956473Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3956656Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3956783Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3956978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3957071Z self.run() 2022-11-23T03:12:18.3957259Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3957389Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3957713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3957835Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3958178Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3958280Z getattr(self, test_name)() 2022-11-23T03:12:18.3958614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3958698Z fn() 2022-11-23T03:12:18.3959036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3959144Z test(self, **param_kwargs) 2022-11-23T03:12:18.3959475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3959584Z return func(*args, **kwargs) 2022-11-23T03:12:18.3959809Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3959901Z self.run_subtests( 2022-11-23T03:12:18.3960228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3960380Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3960717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3960852Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3961198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3961301Z output = model(*input) 2022-11-23T03:12:18.3961602Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3961722Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3962248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3962411Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3962821Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3962940Z _lazy_init(state, module) 2022-11-23T03:12:18.3963281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3963412Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3963735Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3963840Z return func(*args, **kwargs) 2022-11-23T03:12:18.3964205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3964371Z p_assert( 2022-11-23T03:12:18.3964773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3964932Z traceback.print_stack() 2022-11-23T03:12:18.3965127Z File "", line 1, in 2022-11-23T03:12:18.3965453Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3965548Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3965732Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3965879Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3966145Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3966171Z self.run() 2022-11-23T03:12:18.3966385Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3966504Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3966832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3966946Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3967402Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3967523Z getattr(self, test_name)() 2022-11-23T03:12:18.3967860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3967953Z fn() 2022-11-23T03:12:18.3968274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3968317Z test(self, **param_kwargs) 2022-11-23T03:12:18.3968662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3968769Z return func(*args, **kwargs) 2022-11-23T03:12:18.3969004Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3969153Z self.run_subtests( 2022-11-23T03:12:18.3969509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3969660Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3970042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3970148Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3970833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3970935Z output = model(*input) 2022-11-23T03:12:18.3971248Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3971379Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3971740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3971907Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3972312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3972429Z _lazy_init(state, module) 2022-11-23T03:12:18.3972770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3972893Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3973221Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3973332Z return func(*args, **kwargs) 2022-11-23T03:12:18.3973696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3973787Z p_assert( 2022-11-23T03:12:18.3974109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3974276Z traceback.print_stack() 2022-11-23T03:12:18.3974671Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.3975058Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.3975170Z File "", line 1, in 2022-11-23T03:12:18.3975367Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3975546Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3975694Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3975832Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3976026Z File "", line 1, in 2022-11-23T03:12:18.3976150Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3976235Z self.run() 2022-11-23T03:12:18.3976587Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3976718Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3976911Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3977038Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3977355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3977472Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3977651Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3977778Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3978293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3978406Z getattr(self, test_name)() 2022-11-23T03:12:18.3978696Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3978700Z self.run() 2022-11-23T03:12:18.3979055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3979142Z fn() 2022-11-23T03:12:18.3979327Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T03:12:18.3979461Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3979816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3980034Z test(self, **param_kwargs) 2022-11-23T03:12:18.3980252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3980373Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3980719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3980836Z return func(*args, **kwargs) 2022-11-23T03:12:18.3981223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3981413Z getattr(self, test_name)() 2022-11-23T03:12:18.3981580Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3981681Z self.run_subtests( 2022-11-23T03:12:18.3982180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3982311Z fn() 2022-11-23T03:12:18.3982594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3982740Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3983251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3983412Z test(self, **param_kwargs) 2022-11-23T03:12:18.3983765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3984181Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3984650Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3984733Z return func(*args, **kwargs) 2022-11-23T03:12:18.3985026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3985134Z output = model(*input) 2022-11-23T03:12:18.3985364Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3985468Z self.run_subtests( 2022-11-23T03:12:18.3985779Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3985911Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3986252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.3986402Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.3986767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3986931Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3987274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.3987413Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.3987911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3988021Z _lazy_init(state, module) 2022-11-23T03:12:18.3988378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.3988486Z output = model(*input) 2022-11-23T03:12:18.3988969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.3989100Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3989407Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.3989570Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.3989969Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3990142Z return func(*args, **kwargs) 2022-11-23T03:12:18.3990545Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.3990659Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.3991063Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3991163Z p_assert( 2022-11-23T03:12:18.3991514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.3991626Z _lazy_init(state, module) 2022-11-23T03:12:18.3991947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.3992064Z traceback.print_stack() 2022-11-23T03:12:18.3992399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.3992530Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.3992853Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.3992967Z return func(*args, **kwargs) 2022-11-23T03:12:18.3993399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.3993493Z p_assert( 2022-11-23T03:12:18.3993817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in 
p_assert 2022-11-23T03:12:18.3993931Z traceback.print_stack() 2022-11-23T03:12:18.3994166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.3994395Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.3994614Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.3994830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.3995212Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.3995597Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.3995717Z File "", line 1, in 2022-11-23T03:12:18.3996009Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.3996050Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.3996240Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.3996378Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.3996578Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.3996664Z self.run() 2022-11-23T03:12:18.3996853Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.3996986Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.3997313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.3997439Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.3997785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.3997915Z 
getattr(self, test_name)() 2022-11-23T03:12:18.3998244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.3998323Z fn() 2022-11-23T03:12:18.3998991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.3999103Z test(self, **param_kwargs) 2022-11-23T03:12:18.3999445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.3999557Z return func(*args, **kwargs) 2022-11-23T03:12:18.3999789Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.3999946Z self.run_subtests( 2022-11-23T03:12:18.4000303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4000449Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4000793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4000934Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4001290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4001401Z output = model(*input) 2022-11-23T03:12:18.4001714Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4001845Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4002267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4002425Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4002782Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4002896Z _lazy_init(state, module) 2022-11-23T03:12:18.4003331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4003370Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4003693Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4003807Z return func(*args, **kwargs) 2022-11-23T03:12:18.4004316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4004402Z p_assert( 2022-11-23T03:12:18.4004716Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4004828Z traceback.print_stack() 2022-11-23T03:12:18.4004943Z File "", line 1, in 2022-11-23T03:12:18.4005132Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4005257Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4005440Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4005574Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4005760Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4005849Z self.run() 2022-11-23T03:12:18.4006033Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4006165Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4006484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4006608Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4006945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T03:12:18.4007047Z getattr(self, test_name)() 2022-11-23T03:12:18.4007382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4007467Z fn() 2022-11-23T03:12:18.4007807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4007914Z test(self, **param_kwargs) 2022-11-23T03:12:18.4008251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4008359Z return func(*args, **kwargs) 2022-11-23T03:12:18.4008583Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4008727Z self.run_subtests( 2022-11-23T03:12:18.4009067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4009210Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4009547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4009683Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4010029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4010138Z output = model(*input) 2022-11-23T03:12:18.4010619Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4010741Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4011164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4011329Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4011678Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4011786Z _lazy_init(state, module) 2022-11-23T03:12:18.4012215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4012254Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4012576Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4012688Z return func(*args, **kwargs) 2022-11-23T03:12:18.4013047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4013143Z p_assert( 2022-11-23T03:12:18.4013472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4013587Z traceback.print_stack() 2022-11-23T03:12:18.4013975Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4014359Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 
2022-11-23T03:12:18.4014478Z File "", line 1, in 2022-11-23T03:12:18.4014676Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4014802Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4014995Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4015132Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4015332Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4015427Z self.run() 2022-11-23T03:12:18.4015619Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4015754Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4016074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4016197Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4016548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4016658Z getattr(self, test_name)() 2022-11-23T03:12:18.4017000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4017088Z fn() 2022-11-23T03:12:18.4017445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4017633Z test(self, **param_kwargs) 2022-11-23T03:12:18.4017960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4018083Z return func(*args, **kwargs) 2022-11-23T03:12:18.4018318Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4018419Z self.run_subtests( 2022-11-23T03:12:18.4018908Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4019054Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4019391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4019527Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4019868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4020027Z output = model(*input) 2022-11-23T03:12:18.4020505Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4020636Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4021001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4021165Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4021515Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4021625Z _lazy_init(state, module) 2022-11-23T03:12:18.4021958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4022091Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4022423Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4022538Z return func(*args, **kwargs) 2022-11-23T03:12:18.4022905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4022996Z p_assert( 2022-11-23T03:12:18.4023478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4023587Z traceback.print_stack() 2022-11-23T03:12:18.4023694Z File "", line 1, in 2022-11-23T03:12:18.4024395Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4024553Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4024724Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4024837Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4025038Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4025094Z self.run() 2022-11-23T03:12:18.4025282Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4025418Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4025754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4025875Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4026224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4026334Z getattr(self, test_name)() 2022-11-23T03:12:18.4026674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4026762Z fn() 2022-11-23T03:12:18.4027107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4027225Z test(self, **param_kwargs) 2022-11-23T03:12:18.4027650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4027774Z return func(*args, **kwargs) 2022-11-23T03:12:18.4028008Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4028111Z self.run_subtests( 2022-11-23T03:12:18.4028453Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4028759Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4029271Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4029413Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4029775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4029956Z output = model(*input) 2022-11-23T03:12:18.4030273Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4030403Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4030764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4031015Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4031277Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4031388Z _lazy_init(state, module) 2022-11-23T03:12:18.4031725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4031856Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4032190Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4032304Z return func(*args, **kwargs) 2022-11-23T03:12:18.4032670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4032762Z p_assert( 2022-11-23T03:12:18.4033080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4033195Z traceback.print_stack() 2022-11-23T03:12:18.4033432Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.4033656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.4033875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.4034091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.4034487Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4034875Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4035150Z File "", line 1, in 2022-11-23T03:12:18.4035336Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4035462Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4035648Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4035781Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4035975Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4036063Z self.run() 2022-11-23T03:12:18.4036248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4036428Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4036755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4036874Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4037211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4037318Z getattr(self, 
test_name)() 2022-11-23T03:12:18.4037649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4037731Z fn() 2022-11-23T03:12:18.4038068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4038168Z test(self, **param_kwargs) 2022-11-23T03:12:18.4038497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4038662Z return func(*args, **kwargs) 2022-11-23T03:12:18.4038889Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4038988Z self.run_subtests( 2022-11-23T03:12:18.4039317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4039461Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4039796Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4039925Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4040273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4040546Z output = model(*input) 2022-11-23T03:12:18.4040864Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4041002Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4041372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4041591Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4041951Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
334, in _root_pre_forward 2022-11-23T03:12:18.4042054Z _lazy_init(state, module) 2022-11-23T03:12:18.4042392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4042524Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4042847Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4042958Z return func(*args, **kwargs) 2022-11-23T03:12:18.4043328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4043486Z p_assert( 2022-11-23T03:12:18.4043745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4043853Z traceback.print_stack() 2022-11-23T03:12:18.4043971Z File "", line 1, in 2022-11-23T03:12:18.4044174Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4044304Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4044644Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4044779Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4044974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4045305Z self.run() 2022-11-23T03:12:18.4045422Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4045651Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4045991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4046110Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4046458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4046570Z getattr(self, test_name)() 
2022-11-23T03:12:18.4046917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4046996Z fn() 2022-11-23T03:12:18.4047350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4047466Z test(self, **param_kwargs) 2022-11-23T03:12:18.4047809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4047977Z return func(*args, **kwargs) 2022-11-23T03:12:18.4048213Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4048314Z self.run_subtests( 2022-11-23T03:12:18.4048652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4048795Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4049296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4049433Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4049784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4049968Z output = model(*input) 2022-11-23T03:12:18.4050193Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4050323Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4050671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4050828Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4051163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:18.4051270Z _lazy_init(state, module) 2022-11-23T03:12:18.4051598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4051726Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4052038Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4052146Z return func(*args, **kwargs) 2022-11-23T03:12:18.4052509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4052598Z p_assert( 2022-11-23T03:12:18.4052903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4053014Z traceback.print_stack() 2022-11-23T03:12:18.4053390Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4053762Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.4053876Z File "", line 1, in 2022-11-23T03:12:18.4054066Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4054190Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4054372Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4054551Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4054752Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4054840Z self.run() 2022-11-23T03:12:18.4055024Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4055155Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4055472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4055587Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4055915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4056023Z getattr(self, test_name)() 2022-11-23T03:12:18.4056355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4056482Z fn() 2022-11-23T03:12:18.4056823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4056931Z test(self, **param_kwargs) 2022-11-23T03:12:18.4057263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4057372Z return func(*args, **kwargs) 2022-11-23T03:12:18.4057591Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4057690Z self.run_subtests( 2022-11-23T03:12:18.4058021Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4058169Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4058503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4058644Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4058996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4059100Z output = model(*input) 2022-11-23T03:12:18.4059206Z File "", line 1, in 2022-11-23T03:12:18.4059509Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4059635Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4059824Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4059948Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4060296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4060453Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4060726Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4060773Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4061202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4061264Z _lazy_init(state, module) 2022-11-23T03:12:18.4061508Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4061541Z self.run() 2022-11-23T03:12:18.4061840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4061965Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.4062149Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4062271Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4062585Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4062698Z return func(*args, **kwargs) 2022-11-23T03:12:18.4063058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4063182Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4063539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4063626Z p_assert( 2022-11-23T03:12:18.4064512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4064646Z getattr(self, test_name)() 2022-11-23T03:12:18.4064883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4064994Z traceback.print_stack() 2022-11-23T03:12:18.4065429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4065590Z fn() 2022-11-23T03:12:18.4065871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4065983Z test(self, **param_kwargs) 2022-11-23T03:12:18.4066319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4066433Z return func(*args, **kwargs) 2022-11-23T03:12:18.4066668Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4066770Z self.run_subtests( 2022-11-23T03:12:18.4067107Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4067258Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4067605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4067749Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4068415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4068527Z output = model(*input) 2022-11-23T03:12:18.4068841Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4068972Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4069373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4069540Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4069893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4070002Z _lazy_init(state, module) 2022-11-23T03:12:18.4070386Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4070469Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4070796Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4070909Z return func(*args, **kwargs) 2022-11-23T03:12:18.4071275Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4071364Z p_assert( 2022-11-23T03:12:18.4071683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4071797Z traceback.print_stack() 2022-11-23T03:12:18.4072125Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.4072292Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.4072577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.4072756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.4073253Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4073602Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4073923Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4074301Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 
2022-11-23T03:12:18.4074417Z File "", line 1, in 2022-11-23T03:12:18.4074526Z File "", line 1, in 2022-11-23T03:12:18.4074777Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4074975Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4075031Z File "", line 1, in 2022-11-23T03:12:18.4075231Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4075360Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4075591Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4075688Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4075876Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4076007Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4076193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4076375Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4076533Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4076629Z self.run() 2022-11-23T03:12:18.4077025Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4077107Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4077302Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4077390Z self.run() 2022-11-23T03:12:18.4077573Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4077703Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4077894Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4077982Z self.run() 2022-11-23T03:12:18.4078164Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4078453Z 
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4078791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4078919Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4079105Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4079238Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4079565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4079684Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4080063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4080141Z getattr(self, test_name)() 2022-11-23T03:12:18.4080463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4080582Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4080930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4081112Z getattr(self, test_name)() 2022-11-23T03:12:18.4081468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4081555Z fn() 2022-11-23T03:12:18.4081888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4082000Z getattr(self, test_name)() 2022-11-23T03:12:18.4082355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4082428Z fn() 2022-11-23T03:12:18.4082779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4082890Z test(self, **param_kwargs) 2022-11-23T03:12:18.4083230Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4083366Z fn() 2022-11-23T03:12:18.4083721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4083833Z test(self, **param_kwargs) 2022-11-23T03:12:18.4084177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4084289Z return func(*args, **kwargs) 2022-11-23T03:12:18.4084627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4084739Z return func(*args, **kwargs) 2022-11-23T03:12:18.4085089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4085199Z test(self, **param_kwargs) 2022-11-23T03:12:18.4085427Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4085534Z self.run_subtests( 2022-11-23T03:12:18.4085766Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4085867Z self.run_subtests( 2022-11-23T03:12:18.4086371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4086481Z return func(*args, **kwargs) 2022-11-23T03:12:18.4086984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4087134Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4087464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4087613Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 
2022-11-23T03:12:18.4087849Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4087955Z self.run_subtests( 2022-11-23T03:12:18.4088310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4088451Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4088797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4088936Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4089266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4089413Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4089929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4090034Z output = model(*input) 2022-11-23T03:12:18.4090616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4090732Z output = model(*input) 2022-11-23T03:12:18.4091086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4091228Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4091533Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4091663Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4091978Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4092106Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4092465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 
827, in _train_for_several_steps 2022-11-23T03:12:18.4092573Z output = model(*input) 2022-11-23T03:12:18.4093014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4093180Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4093535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4093699Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4094007Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4094135Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4094487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4094600Z _lazy_init(state, module) 2022-11-23T03:12:18.4094956Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4095069Z _lazy_init(state, module) 2022-11-23T03:12:18.4095425Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4095589Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4095929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4096061Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4096437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4096594Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4096913Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T03:12:18.4097024Z _lazy_init(state, module) 2022-11-23T03:12:18.4097349Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4097464Z return func(*args, **kwargs) 2022-11-23T03:12:18.4097787Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4097995Z return func(*args, **kwargs) 2022-11-23T03:12:18.4098345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4098446Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4098505Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4098957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4098996Z p_assert( 2022-11-23T03:12:18.4099306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4099396Z p_assert( 2022-11-23T03:12:18.4099776Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4099898Z return func(*args, **kwargs) 2022-11-23T03:12:18.4100228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4100342Z traceback.print_stack() 2022-11-23T03:12:18.4100540Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4100663Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4100985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4101097Z traceback.print_stack() 2022-11-23T03:12:18.4101464Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T03:12:18.4101555Z p_assert( 2022-11-23T03:12:18.4101747Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4101943Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4102268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4102375Z traceback.print_stack() 2022-11-23T03:12:18.4102573Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4102664Z self.run() 2022-11-23T03:12:18.4102853Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4102987Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4103308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4103428Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4103770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4104245Z getattr(self, test_name)() 2022-11-23T03:12:18.4104629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4104708Z fn() 2022-11-23T03:12:18.4105064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4105151Z test(self, **param_kwargs) 2022-11-23T03:12:18.4105426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4105538Z return func(*args, **kwargs) 2022-11-23T03:12:18.4105766Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4105867Z self.run_subtests( 2022-11-23T03:12:18.4106203Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4106355Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4106710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4106851Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4107210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4107317Z output = model(*input) 2022-11-23T03:12:18.4107621Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4107908Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4108258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4108415Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4108755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4108940Z _lazy_init(state, module) 2022-11-23T03:12:18.4109279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4109404Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4109811Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4109825Z return func(*args, **kwargs) 2022-11-23T03:12:18.4110176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4110262Z p_assert( 2022-11-23T03:12:18.4110574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4110686Z traceback.print_stack() 2022-11-23T03:12:18.4110911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.4111208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.4111413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.4111622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.4112001Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4112542Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4112920Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4113302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4113536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.4113756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.4113971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.4114350Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 
2022-11-23T03:12:18.4114571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.4114948Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4115328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4115709Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4115933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.4116152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.4116368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.4116748Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4117124Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4117348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.4117771Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4118305Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.4118529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.4118739Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.4118949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.4119316Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4119534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.4119895Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4120312Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4120846Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4121072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.4121293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.4121513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.4121889Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4122113Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.4122496Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.4122873Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4123250Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4124146Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4124852Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4125744Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4125974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.4126197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.4126417Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.4126798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.4127030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.4127458Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4127847Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4128227Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4128448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.4128659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.4129030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.4129393Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4129667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.4130034Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4130396Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4130758Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 
2022-11-23T03:12:18.4130973Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.4131184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.4131388Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.4131758Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4131980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.4132409Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4132888Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4133355Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4133499Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.4133708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.4133933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.4134308Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4134526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.4134905Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.4135281Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4135657Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4135879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.4136150Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.4136373Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.4136749Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4136974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.4137339Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4137713Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4138238Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4138503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.4138712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.4138922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.4139283Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 
2022-11-23T03:12:18.4139498Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.4139860Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4140224Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4140577Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4141486Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4142281Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4143008Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4143245Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.4143469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.4143687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.4144505Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4144767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.4145149Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4145537Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4145916Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4146149Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T03:12:18.4146371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 3 2022-11-23T03:12:18.4146590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T03:12:18.4146971Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.4147200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 2 2022-11-23T03:12:18.4147576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 
2022-11-23T03:12:18.4148020Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.4148395Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.4148619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T03:12:18.4148832Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 2 2022-11-23T03:12:18.4149048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T03:12:18.4149578Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.4149942Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.4150170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 3 2022-11-23T03:12:18.4150533Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.4150895Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.4151108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T03:12:18.4151320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 3 2022-11-23T03:12:18.4151524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T03:12:18.4151887Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 
2022-11-23T03:12:18.4152105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 2 2022-11-23T03:12:18.4152474Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.4152835Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.4153195Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.4153409Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 2 2022-11-23T03:12:18.4153619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T03:12:18.4153829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T03:12:18.4154190Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.4154453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 3 2022-11-23T03:12:18.4154827Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.4155189Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.4155731Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 
2022-11-23T03:12:18.4155952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 3 2022-11-23T03:12:18.4156172Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T03:12:18.4156389Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T03:12:18.4156817Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.4157043Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 2 2022-11-23T03:12:18.4157410Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.4157786Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.4158159Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.4159031Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4159737Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4160445Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4160666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 3 2022-11-23T03:12:18.4160881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T03:12:18.4161192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T03:12:18.4161473Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.4161694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 2 2022-11-23T03:12:18.4162059Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.4162417Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.4162786Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.4163004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 2 2022-11-23T03:12:18.4163265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T03:12:18.4163483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T03:12:18.4163849Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 
2022-11-23T03:12:18.4164068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 3 2022-11-23T03:12:18.4164431Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.4164798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.4165164Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.4165417Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 2 2022-11-23T03:12:18.4165827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T03:12:18.4166029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T03:12:18.4166467Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.4166783Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.4167010Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 3 2022-11-23T03:12:18.4167386Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.4167763Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.4168658Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4169417Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4170125Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4170358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T03:12:18.4170566Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 2 2022-11-23T03:12:18.4170947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T03:12:18.4171430Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.4171713Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.4171939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 3 2022-11-23T03:12:18.4172320Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.4172751Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 
2022-11-23T03:12:18.4172982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 3 2022-11-23T03:12:18.4173203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T03:12:18.4173413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 2 2022-11-23T03:12:18.4173796Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.4174024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T03:12:18.4174401Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.4174835Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.4175213Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.4175437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T03:12:18.4175657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 3 2022-11-23T03:12:18.4175927Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 2 2022-11-23T03:12:18.4176248Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.4176473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T03:12:18.4176848Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 
2022-11-23T03:12:18.4177386Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.4177752Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.4178459Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4179347Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4180081Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4180403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 3 2022-11-23T03:12:18.4180537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 2 2022-11-23T03:12:18.4180756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T03:12:18.4181135Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 
2022-11-23T03:12:18.4181356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T03:12:18.4181784Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.4182176Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.4182712Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.4182813Z dist init r=2, world=4 2022-11-23T03:12:18.4183121Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4183415Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4184304Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4184760Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4185051Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4185342Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4185622Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.4185910Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4186208Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4186497Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4186942Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4187221Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4187319Z dist init r=1, world=4 2022-11-23T03:12:18.4187622Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4187920Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4188270Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4188556Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4188830Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.4189110Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4189453Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4189740Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4190017Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4190294Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4190571Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4190850Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4190993Z dist init r=0, world=4 2022-11-23T03:12:18.4191293Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4191585Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4192039Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.4192328Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4192622Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4192919Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4193250Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4193498Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4193787Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4194075Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4194367Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4194657Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4194756Z dist init r=3, world=4 2022-11-23T03:12:18.4195065Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.4195361Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4195655Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4196321Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4196623Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4196911Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4197200Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4197487Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4197777Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4198120Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4198408Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.4198697Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4198788Z ok (8.027s) 2022-11-23T03:12:18.4199277Z test_mixture_of_experts_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16901 2022-11-23T03:12:18.4199647Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16902 2022-11-23T03:12:18.4199861Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16903 2022-11-23T03:12:18.4200065Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16904 2022-11-23T03:12:18.4200447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4200611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4200981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4201160Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4201509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4201673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4202046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4202284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4202588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4202752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:12:18.4203114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4203294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4203646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4203803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4204252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4204412Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4204652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.4204969Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.4205265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.4205485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.4205860Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4206229Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4206681Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4207008Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.4207217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.4207426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.4207630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.4207832Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.4208806Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4208911Z warnings.warn( 2022-11-23T03:12:18.4209874Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4209969Z warnings.warn( 2022-11-23T03:12:18.4210209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.4211144Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4211236Z warnings.warn( 2022-11-23T03:12:18.4211453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.4212393Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4212490Z warnings.warn( 2022-11-23T03:12:18.4212923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.4213157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.4213546Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4213925Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.4214043Z File "", line 1, in 2022-11-23T03:12:18.4214242Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4214367Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4214560Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4214750Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4214954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4215047Z self.run() 2022-11-23T03:12:18.4215237Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4215371Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4215704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4215821Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4216171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4216373Z getattr(self, test_name)() 2022-11-23T03:12:18.4216631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4216717Z fn() 2022-11-23T03:12:18.4217077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4217189Z test(self, **param_kwargs) 2022-11-23T03:12:18.4217525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4217639Z return func(*args, **kwargs) 2022-11-23T03:12:18.4217871Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4217972Z self.run_subtests( 2022-11-23T03:12:18.4218468Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4218796Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4219144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4219287Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4219659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4219762Z output = model(*input) 2022-11-23T03:12:18.4220075Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4220206Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4220617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4220786Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4221143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4221251Z _lazy_init(state, module) 2022-11-23T03:12:18.4221589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4221718Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4222094Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4222217Z return func(*args, **kwargs) 2022-11-23T03:12:18.4222586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4222740Z p_assert( 2022-11-23T03:12:18.4222999Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4223114Z traceback.print_stack() 2022-11-23T03:12:18.4223498Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4224413Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4224572Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4224776Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4224908Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4225100Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4225240Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4225474Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4225527Z self.run() 2022-11-23T03:12:18.4225720Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4225853Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4226190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4226311Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4226665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4226784Z getattr(self, test_name)() 2022-11-23T03:12:18.4227131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4227210Z fn() 2022-11-23T03:12:18.4227561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4227672Z test(self, **param_kwargs) 2022-11-23T03:12:18.4228015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 166, in wrapper 2022-11-23T03:12:18.4228126Z return func(*args, **kwargs) 2022-11-23T03:12:18.4228362Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4228462Z self.run_subtests( 2022-11-23T03:12:18.4228802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4228954Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4229462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4229601Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4229952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4230055Z output = model(*input) 2022-11-23T03:12:18.4230359Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4230484Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4230833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4230986Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4231583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4231747Z _lazy_init(state, module) 2022-11-23T03:12:18.4232046Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4232177Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4232502Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4232614Z return func(*args, **kwargs) 2022-11-23T03:12:18.4232980Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4233064Z p_assert( 2022-11-23T03:12:18.4233388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4233501Z traceback.print_stack() 2022-11-23T03:12:18.4233685Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4233883Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4234014Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4234203Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4234342Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4234538Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4234630Z self.run() 2022-11-23T03:12:18.4234817Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4234949Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4235280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4235401Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4235750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4235864Z getattr(self, test_name)() 2022-11-23T03:12:18.4236214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4236302Z fn() 2022-11-23T03:12:18.4236655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4236767Z test(self, **param_kwargs) 2022-11-23T03:12:18.4237107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.4237220Z return func(*args, **kwargs) 2022-11-23T03:12:18.4237454Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4237550Z self.run_subtests( 2022-11-23T03:12:18.4237887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4238042Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4238389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4238529Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4239046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4239151Z output = model(*input) 2022-11-23T03:12:18.4239453Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4239571Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4239922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4240081Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4240475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4240589Z _lazy_init(state, module) 2022-11-23T03:12:18.4240920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4241047Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4241530Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4241692Z return func(*args, **kwargs) 2022-11-23T03:12:18.4242064Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4242155Z p_assert( 2022-11-23T03:12:18.4242483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4242596Z traceback.print_stack() 2022-11-23T03:12:18.4242769Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4242972Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4243102Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4243336Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4243424Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4243626Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4243720Z self.run() 2022-11-23T03:12:18.4243913Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4244046Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4244532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4244651Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4244982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4245098Z getattr(self, test_name)() 2022-11-23T03:12:18.4245436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4245518Z fn() 2022-11-23T03:12:18.4246040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4246188Z test(self, **param_kwargs) 2022-11-23T03:12:18.4246496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.4246608Z return func(*args, **kwargs) 2022-11-23T03:12:18.4246834Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4246936Z self.run_subtests( 2022-11-23T03:12:18.4247277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4247435Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4247783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4247925Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4248286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4248392Z output = model(*input) 2022-11-23T03:12:18.4248697Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4248827Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4249188Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4249352Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4249916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4250030Z _lazy_init(state, module) 2022-11-23T03:12:18.4250360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4250487Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4250796Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4250905Z return func(*args, **kwargs) 2022-11-23T03:12:18.4251256Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4251344Z p_assert( 2022-11-23T03:12:18.4251653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4251763Z traceback.print_stack() 2022-11-23T03:12:18.4252042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.4252267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.4252483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.4252703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.4253079Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4253448Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 
2022-11-23T03:12:18.4253564Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4253755Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4253885Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4254074Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4254203Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4254397Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4254487Z self.run() 2022-11-23T03:12:18.4254671Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4254801Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4255121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4255239Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4255580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4255682Z getattr(self, test_name)() 2022-11-23T03:12:18.4256021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4256105Z fn() 2022-11-23T03:12:18.4256444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4256553Z test(self, **param_kwargs) 2022-11-23T03:12:18.4256885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4256993Z return func(*args, **kwargs) 2022-11-23T03:12:18.4257213Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4257311Z self.run_subtests( 2022-11-23T03:12:18.4257637Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4257785Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4258173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4258317Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4258667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4258774Z output = model(*input) 2022-11-23T03:12:18.4259077Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4259196Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4259547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4259706Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4260047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4260204Z _lazy_init(state, module) 2022-11-23T03:12:18.4260534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4260660Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4260972Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4261075Z return func(*args, **kwargs) 2022-11-23T03:12:18.4261427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4261516Z p_assert( 2022-11-23T03:12:18.4261828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4261938Z traceback.print_stack() 2022-11-23T03:12:18.4262050Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4262421Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4262551Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4262742Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4262881Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4263081Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4263172Z self.run() 2022-11-23T03:12:18.4263361Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4263494Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4263826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4264335Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4264642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4264820Z getattr(self, test_name)() 2022-11-23T03:12:18.4265316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4265346Z fn() 2022-11-23T03:12:18.4279629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4279777Z test(self, **param_kwargs) 2022-11-23T03:12:18.4280377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4280500Z return func(*args, **kwargs) 2022-11-23T03:12:18.4280747Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4280854Z self.run_subtests( 2022-11-23T03:12:18.4281212Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4281361Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4281896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4282068Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4282449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4282560Z output = model(*input) 2022-11-23T03:12:18.4282881Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4283014Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4283385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4283547Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4283912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4284183Z _lazy_init(state, module) 2022-11-23T03:12:18.4284459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4284599Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4284932Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4285051Z return func(*args, **kwargs) 2022-11-23T03:12:18.4285423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4285509Z p_assert( 2022-11-23T03:12:18.4285838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4285957Z traceback.print_stack() 2022-11-23T03:12:18.4286453Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4286839Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4286879Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4287190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4287275Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4287513Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4287669Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4287875Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4287971Z self.run() 2022-11-23T03:12:18.4288166Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4288303Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4288643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4288774Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4289126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4289242Z getattr(self, test_name)() 2022-11-23T03:12:18.4289596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4289687Z fn() 2022-11-23T03:12:18.4290167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4290285Z test(self, **param_kwargs) 2022-11-23T03:12:18.4290639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 166, in wrapper 2022-11-23T03:12:18.4290758Z return func(*args, **kwargs) 2022-11-23T03:12:18.4290994Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4291105Z self.run_subtests( 2022-11-23T03:12:18.4291515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4291677Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4292108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4292256Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4292627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4292737Z output = model(*input) 2022-11-23T03:12:18.4293046Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4293178Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4293547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4293784Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4294148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4294262Z _lazy_init(state, module) 2022-11-23T03:12:18.4294607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4294819Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4295313Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4295489Z return func(*args, **kwargs) 2022-11-23T03:12:18.4295978Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4296074Z p_assert( 2022-11-23T03:12:18.4296423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4296544Z traceback.print_stack() 2022-11-23T03:12:18.4296669Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4296877Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4297004Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4297277Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4297341Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4297545Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4297640Z self.run() 2022-11-23T03:12:18.4297834Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4297970Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4298293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4298424Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4298778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4298981Z getattr(self, test_name)() 2022-11-23T03:12:18.4299246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4299335Z fn() 2022-11-23T03:12:18.4299693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4299808Z test(self, **param_kwargs) 2022-11-23T03:12:18.4300151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.4300266Z return func(*args, **kwargs) 2022-11-23T03:12:18.4300504Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4300611Z self.run_subtests( 2022-11-23T03:12:18.4301018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4301181Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4301538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4301686Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4302113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4302155Z output = model(*input) 2022-11-23T03:12:18.4302474Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4302618Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4302975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4303201Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4303563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4303675Z _lazy_init(state, module) 2022-11-23T03:12:18.4304484Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4304610Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4304976Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4305086Z return func(*args, **kwargs) 2022-11-23T03:12:18.4305473Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4305571Z p_assert( 2022-11-23T03:12:18.4305896Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4306008Z traceback.print_stack() 2022-11-23T03:12:18.4306253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.4306491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.4306637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.4306867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.4307261Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4307651Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.4307777Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.4307984Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4308111Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4308310Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4308452Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4308660Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4308756Z self.run() 2022-11-23T03:12:18.4308954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4309094Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4309431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4309549Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4309909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4310118Z getattr(self, test_name)() 2022-11-23T03:12:18.4310567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4310581Z fn() 2022-11-23T03:12:18.4310943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4311060Z test(self, **param_kwargs) 2022-11-23T03:12:18.4311408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4311518Z return func(*args, **kwargs) 2022-11-23T03:12:18.4311761Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4311866Z self.run_subtests( 2022-11-23T03:12:18.4312207Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4312435Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4312792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4312937Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4313303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4313408Z output = model(*input) 2022-11-23T03:12:18.4313723Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4313855Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4314224Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4314392Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4314758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4314871Z _lazy_init(state, module) 2022-11-23T03:12:18.4315214Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4315341Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4315671Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4315786Z return func(*args, **kwargs) 2022-11-23T03:12:18.4316156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4316248Z p_assert( 2022-11-23T03:12:18.4316578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4316697Z traceback.print_stack() 2022-11-23T03:12:18.4316825Z File "", line 1, in 2022-11-23T03:12:18.4317023Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4339966Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4340239Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4340389Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4340602Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4340704Z self.run() 2022-11-23T03:12:18.4340903Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4341035Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4341419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4341551Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4341967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4342215Z getattr(self, test_name)() 2022-11-23T03:12:18.4342602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4342695Z fn() 2022-11-23T03:12:18.4343057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4343164Z test(self, **param_kwargs) 2022-11-23T03:12:18.4343518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4343636Z return func(*args, **kwargs) 2022-11-23T03:12:18.4344150Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4344347Z self.run_subtests( 2022-11-23T03:12:18.4344687Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4345012Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4345378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4345544Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4345861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4346022Z output = model(*input) 2022-11-23T03:12:18.4346250Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4346387Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4346757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4346927Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4347294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4347400Z _lazy_init(state, module) 2022-11-23T03:12:18.4347746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4347881Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4348277Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4348331Z return func(*args, **kwargs) 2022-11-23T03:12:18.4348703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4348800Z p_assert( 2022-11-23T03:12:18.4349130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4349241Z traceback.print_stack() 2022-11-23T03:12:18.4349645Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4350035Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4350158Z File "", line 1, in 2022-11-23T03:12:18.4350360Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4350496Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4350693Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4350838Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4351034Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4351131Z self.run() 2022-11-23T03:12:18.4351324Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4351468Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4351876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4352016Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4352380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4352498Z getattr(self, test_name)() 2022-11-23T03:12:18.4352838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4352928Z fn() 2022-11-23T03:12:18.4353284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4353400Z test(self, **param_kwargs) 2022-11-23T03:12:18.4353757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 166, in wrapper 2022-11-23T03:12:18.4353940Z return func(*args, **kwargs) 2022-11-23T03:12:18.4354190Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4354382Z self.run_subtests( 2022-11-23T03:12:18.4354839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4355034Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4355290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4355526Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4355902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4356008Z output = model(*input) 2022-11-23T03:12:18.4356367Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4356495Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4356861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4357129Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4357482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4357647Z _lazy_init(state, module) 2022-11-23T03:12:18.4357968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4358101Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4358530Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4358632Z return func(*args, **kwargs) 2022-11-23T03:12:18.4359076Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4359142Z p_assert( 2022-11-23T03:12:18.4359430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4359640Z traceback.print_stack() 2022-11-23T03:12:18.4359761Z File "", line 1, in 2022-11-23T03:12:18.4359957Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4360092Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4360295Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4360432Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4360637Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4360824Z self.run() 2022-11-23T03:12:18.4361023Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4361157Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4361549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4361676Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4362026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4362143Z getattr(self, test_name)() 2022-11-23T03:12:18.4362496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4362585Z fn() 2022-11-23T03:12:18.4362939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4363054Z test(self, **param_kwargs) 2022-11-23T03:12:18.4363398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.4363567Z return func(*args, **kwargs) 2022-11-23T03:12:18.4363800Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4363905Z self.run_subtests( 2022-11-23T03:12:18.4364249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4364403Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4364751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4364896Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4365318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4365372Z output = model(*input) 2022-11-23T03:12:18.4365679Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4365819Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4366185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4366351Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4366709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4366821Z _lazy_init(state, module) 2022-11-23T03:12:18.4367160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4367292Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4367608Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4367724Z return func(*args, **kwargs) 2022-11-23T03:12:18.4368095Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4368191Z p_assert( 2022-11-23T03:12:18.4368516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4368633Z traceback.print_stack() 2022-11-23T03:12:18.4368870Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.4369104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.4369368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.4369602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.4370075Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4370439Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 
2022-11-23T03:12:18.4370570Z File "", line 1, in 2022-11-23T03:12:18.4370775Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4370908Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4371101Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4371233Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4371438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4371533Z self.run() 2022-11-23T03:12:18.4371726Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4371863Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4372198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4372374Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4372730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4372836Z getattr(self, test_name)() 2022-11-23T03:12:18.4373330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4373424Z fn() 2022-11-23T03:12:18.4373781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4373895Z test(self, **param_kwargs) 2022-11-23T03:12:18.4374238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4374354Z return func(*args, **kwargs) 2022-11-23T03:12:18.4374591Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4374739Z self.run_subtests( 2022-11-23T03:12:18.4375129Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4375276Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4375540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4375684Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4376085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4376259Z output = model(*input) 2022-11-23T03:12:18.4376508Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4376598Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4398288Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4398512Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4398930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4399046Z _lazy_init(state, module) 2022-11-23T03:12:18.4399516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4399667Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4400012Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4400130Z return func(*args, **kwargs) 2022-11-23T03:12:18.4400496Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4400597Z p_assert( 2022-11-23T03:12:18.4400932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4401204Z traceback.print_stack() 2022-11-23T03:12:18.4401352Z File "", line 1, in 2022-11-23T03:12:18.4401569Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4401717Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4401903Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4402103Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4402323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4402501Z self.run() 2022-11-23T03:12:18.4402621Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4402765Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4403113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4403328Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4403677Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4403793Z getattr(self, test_name)() 2022-11-23T03:12:18.4404143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4404231Z fn() 2022-11-23T03:12:18.4404596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4404720Z test(self, **param_kwargs) 2022-11-23T03:12:18.4405067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4405188Z return func(*args, **kwargs) 2022-11-23T03:12:18.4405416Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4405615Z self.run_subtests( 2022-11-23T03:12:18.4405879Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4406037Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4406393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4406543Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4406912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4407030Z output = model(*input) 2022-11-23T03:12:18.4407336Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4407470Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4407839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4408017Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4408386Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4408503Z _lazy_init(state, module) 2022-11-23T03:12:18.4408847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4408982Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4409299Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4409419Z return func(*args, **kwargs) 2022-11-23T03:12:18.4409791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4409887Z p_assert( 2022-11-23T03:12:18.4410275Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4410406Z traceback.print_stack() 2022-11-23T03:12:18.4410814Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4411213Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4411324Z File "", line 1, in 2022-11-23T03:12:18.4411527Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4411670Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4411872Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4412018Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4412230Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4412386Z self.run() 2022-11-23T03:12:18.4412589Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4412719Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4413131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4413183Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4413619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4413663Z getattr(self, test_name)() 2022-11-23T03:12:18.4414022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4414186Z fn() 2022-11-23T03:12:18.4414561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4414581Z test(self, **param_kwargs) 2022-11-23T03:12:18.4414941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 166, in wrapper 2022-11-23T03:12:18.4415060Z return func(*args, **kwargs) 2022-11-23T03:12:18.4415308Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4415414Z self.run_subtests( 2022-11-23T03:12:18.4415759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4415916Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4416257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4416406Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4416777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4416894Z output = model(*input) 2022-11-23T03:12:18.4417217Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4417349Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4417717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4417887Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4418248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4418353Z _lazy_init(state, module) 2022-11-23T03:12:18.4418696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4418866Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4419166Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4419345Z return func(*args, **kwargs) 2022-11-23T03:12:18.4419734Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4419831Z p_assert( 2022-11-23T03:12:18.4420149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4420268Z traceback.print_stack() 2022-11-23T03:12:18.4420394Z File "", line 1, in 2022-11-23T03:12:18.4420599Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4420741Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4420937Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4421086Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4421298Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4421439Z self.run() 2022-11-23T03:12:18.4421646Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4421791Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4422125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4422255Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4422610Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4422727Z getattr(self, test_name)() 2022-11-23T03:12:18.4423080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4423160Z fn() 2022-11-23T03:12:18.4423518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4423645Z test(self, **param_kwargs) 2022-11-23T03:12:18.4424402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:18.4424556Z return func(*args, **kwargs) 2022-11-23T03:12:18.4424804Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4424900Z self.run_subtests( 2022-11-23T03:12:18.4425258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4425415Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4425765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4425900Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4426199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4426327Z output = model(*input) 2022-11-23T03:12:18.4426655Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4426794Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4427171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4427331Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4427690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4427806Z _lazy_init(state, module) 2022-11-23T03:12:18.4428152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4428290Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4428624Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4428830Z return func(*args, **kwargs) 2022-11-23T03:12:18.4429217Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4429303Z p_assert( 2022-11-23T03:12:18.4429632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4429755Z traceback.print_stack() 2022-11-23T03:12:18.4429999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.4430237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.4430475Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.4430714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.4431185Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.4431299Z File "", line 1, in 2022-11-23T03:12:18.4431509Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4431646Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4431849Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4431992Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4432205Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4432310Z self.run() 2022-11-23T03:12:18.4432511Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4432639Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4432981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4433114Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4433466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4433584Z getattr(self, test_name)() 2022-11-23T03:12:18.4433934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4434026Z fn() 2022-11-23T03:12:18.4434370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4434492Z test(self, **param_kwargs) 2022-11-23T03:12:18.4434846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4434972Z return func(*args, **kwargs) 2022-11-23T03:12:18.4435218Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4435337Z self.run_subtests( 2022-11-23T03:12:18.4435686Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4435901Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4436244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4436395Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4436767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4436887Z output = model(*input) 2022-11-23T03:12:18.4437211Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4437351Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4437721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4437945Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4438321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4438425Z _lazy_init(state, module) 2022-11-23T03:12:18.4438777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4438917Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4439251Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4439369Z return func(*args, **kwargs) 2022-11-23T03:12:18.4439743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4439843Z p_assert( 2022-11-23T03:12:18.4440236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4440360Z traceback.print_stack() 2022-11-23T03:12:18.4440754Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4441153Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4441540Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4441668Z File "", line 1, in 2022-11-23T03:12:18.4441938Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4442087Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4442293Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4442434Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4442654Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4442758Z self.run() 2022-11-23T03:12:18.4442963Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4443113Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4443452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4443585Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4443925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4444046Z getattr(self, test_name)() 2022-11-23T03:12:18.4444405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4444528Z fn() 2022-11-23T03:12:18.4444862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.4444984Z test(self, **param_kwargs) 2022-11-23T03:12:18.4445337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4445460Z return func(*args, **kwargs) 2022-11-23T03:12:18.4445688Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4445801Z self.run_subtests( 2022-11-23T03:12:18.4446152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4446411Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4446671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4446902Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4447251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4447370Z output = model(*input) 2022-11-23T03:12:18.4447683Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4447818Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4448192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4448371Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4448730Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4448847Z _lazy_init(state, module) 2022-11-23T03:12:18.4449190Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4449377Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4449701Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4449825Z return func(*args, **kwargs) 2022-11-23T03:12:18.4450200Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4450299Z p_assert( 2022-11-23T03:12:18.4450628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4450751Z traceback.print_stack() 2022-11-23T03:12:18.4450882Z File "", line 1, in 2022-11-23T03:12:18.4451111Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4451331Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4451433Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4451591Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4451803Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4451907Z self.run() 2022-11-23T03:12:18.4452105Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4452247Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4452579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4452695Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4453049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4453168Z getattr(self, test_name)() 2022-11-23T03:12:18.4453525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4453614Z fn() 2022-11-23T03:12:18.4453982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.4454100Z test(self, **param_kwargs) 2022-11-23T03:12:18.4454436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4454560Z return func(*args, **kwargs) 2022-11-23T03:12:18.4454802Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4454917Z self.run_subtests( 2022-11-23T03:12:18.4455265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4455426Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4455781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4455927Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4456330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4456454Z output = model(*input) 2022-11-23T03:12:18.4456780Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4456924Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4457297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4457466Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4457826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4457945Z _lazy_init(state, module) 2022-11-23T03:12:18.4458295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4458474Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4458818Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4458945Z return func(*args, **kwargs) 2022-11-23T03:12:18.4459319Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4459420Z p_assert( 2022-11-23T03:12:18.4459755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4459877Z traceback.print_stack() 2022-11-23T03:12:18.4459988Z File "", line 1, in 2022-11-23T03:12:18.4460193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4460330Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4460528Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4460680Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4460893Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4460992Z self.run() 2022-11-23T03:12:18.4461188Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4461315Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4461654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4461781Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4462138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4462259Z getattr(self, test_name)() 2022-11-23T03:12:18.4462608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4462707Z fn() 2022-11-23T03:12:18.4463077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.4463184Z test(self, **param_kwargs) 2022-11-23T03:12:18.4463533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4463654Z return func(*args, **kwargs) 2022-11-23T03:12:18.4464313Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4464375Z self.run_subtests( 2022-11-23T03:12:18.4464736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4464843Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4465203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4465346Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4465792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4465915Z output = model(*input) 2022-11-23T03:12:18.4466239Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4466379Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4466795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4466936Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4467294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4467398Z _lazy_init(state, module) 2022-11-23T03:12:18.4467749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4467961Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4468298Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4468421Z return func(*args, **kwargs) 2022-11-23T03:12:18.4468802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4468900Z p_assert( 2022-11-23T03:12:18.4469234Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4469388Z traceback.print_stack() 2022-11-23T03:12:18.4469654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.4469891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.4470128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.4470371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.4470772Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 
2022-11-23T03:12:18.4470904Z File "", line 1, in 2022-11-23T03:12:18.4471114Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4471237Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4471437Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4471586Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4471797Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4471897Z self.run() 2022-11-23T03:12:18.4472098Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4472245Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4472654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4472706Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4473066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4473189Z getattr(self, test_name)() 2022-11-23T03:12:18.4473543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4473635Z fn() 2022-11-23T03:12:18.4473999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4474124Z test(self, **param_kwargs) 2022-11-23T03:12:18.4474457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4474586Z return func(*args, **kwargs) 2022-11-23T03:12:18.4474878Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4474999Z self.run_subtests( 2022-11-23T03:12:18.4475349Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4475507Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4475859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4476011Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4476364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4476481Z output = model(*input) 2022-11-23T03:12:18.4476801Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4476995Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4477388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4477541Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4477898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4478018Z _lazy_init(state, module) 2022-11-23T03:12:18.4478369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4478493Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4478829Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4478955Z return func(*args, **kwargs) 2022-11-23T03:12:18.4479324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4479442Z p_assert( 2022-11-23T03:12:18.4479777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4479904Z traceback.print_stack() 2022-11-23T03:12:18.4480283Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4480678Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4481128Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4481227Z File "", line 1, in 2022-11-23T03:12:18.4481436Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4481575Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4481775Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4481924Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4482143Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4482230Z self.run() 2022-11-23T03:12:18.4482434Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4482580Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4482925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4483057Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4483410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4483529Z getattr(self, test_name)() 2022-11-23T03:12:18.4483867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4484042Z fn() 2022-11-23T03:12:18.4484413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.4484534Z test(self, **param_kwargs) 2022-11-23T03:12:18.4484886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4485012Z return func(*args, **kwargs) 2022-11-23T03:12:18.4485258Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4485442Z self.run_subtests( 2022-11-23T03:12:18.4485772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4485936Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4486296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4486498Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4486872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4487021Z output = model(*input) 2022-11-23T03:12:18.4487308Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4487443Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4487799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4487973Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4488502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4488599Z _lazy_init(state, module) 2022-11-23T03:12:18.4488898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4489044Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4489377Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4489500Z return func(*args, **kwargs) 2022-11-23T03:12:18.4489858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4489957Z p_assert( 2022-11-23T03:12:18.4490285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4490408Z traceback.print_stack() 2022-11-23T03:12:18.4490534Z File "", line 1, in 2022-11-23T03:12:18.4490736Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4490880Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4491083Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4491216Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4491425Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4491527Z self.run() 2022-11-23T03:12:18.4491725Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4491876Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4492212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4492344Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4492705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4492810Z getattr(self, test_name)() 2022-11-23T03:12:18.4493168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4493316Z fn() 2022-11-23T03:12:18.4493684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.4493808Z test(self, **param_kwargs) 2022-11-23T03:12:18.4494158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4494282Z return func(*args, **kwargs) 2022-11-23T03:12:18.4494510Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4494620Z self.run_subtests( 2022-11-23T03:12:18.4494962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4495120Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4495530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4495681Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4496051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4496224Z output = model(*input) 2022-11-23T03:12:18.4496528Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4496667Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4497039Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4497209Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4497569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4497687Z _lazy_init(state, module) 2022-11-23T03:12:18.4498033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4498173Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4498603Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4498611Z return func(*args, **kwargs) 2022-11-23T03:12:18.4498982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4499077Z p_assert( 2022-11-23T03:12:18.4499492Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4499610Z traceback.print_stack() 2022-11-23T03:12:18.4499744Z File "", line 1, in 2022-11-23T03:12:18.4499932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4500071Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4500280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4500341Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4500550Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4500650Z self.run() 2022-11-23T03:12:18.4500850Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4500994Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4501331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4501447Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4501807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4501931Z getattr(self, test_name)() 2022-11-23T03:12:18.4502339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4502443Z fn() 2022-11-23T03:12:18.4502821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.4502927Z test(self, **param_kwargs) 2022-11-23T03:12:18.4503281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4503387Z return func(*args, **kwargs) 2022-11-23T03:12:18.4503628Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4503742Z self.run_subtests( 2022-11-23T03:12:18.4504464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4504632Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4505107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4505236Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4505611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4505726Z output = model(*input) 2022-11-23T03:12:18.4505981Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4506098Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4506469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4506709Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4507011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4507134Z _lazy_init(state, module) 2022-11-23T03:12:18.4507483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4507608Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4507945Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4508069Z return func(*args, **kwargs) 2022-11-23T03:12:18.4508442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4508539Z p_assert( 2022-11-23T03:12:18.4508870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4508995Z traceback.print_stack() 2022-11-23T03:12:18.4509235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.4509461Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.4509711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.4509949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.4510346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.4510478Z File "", line 1, in 2022-11-23T03:12:18.4510689Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4510831Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4511095Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4511324Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4511441Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4511550Z self.run() 2022-11-23T03:12:18.4511856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4512014Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4512356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4512490Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4512850Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4512955Z getattr(self, test_name)() 2022-11-23T03:12:18.4513306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4513402Z fn() 2022-11-23T03:12:18.4513760Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4514045Z test(self, **param_kwargs) 2022-11-23T03:12:18.4514400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4514521Z return func(*args, **kwargs) 2022-11-23T03:12:18.4514748Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4514857Z self.run_subtests( 2022-11-23T03:12:18.4515207Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4515369Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4515726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4515877Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4516250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4516438Z output = model(*input) 2022-11-23T03:12:18.4516837Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4516887Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4517257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4517431Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4517791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4517912Z _lazy_init(state, module) 2022-11-23T03:12:18.4518259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4518400Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4518736Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4518850Z return func(*args, **kwargs) 2022-11-23T03:12:18.4519235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4519409Z p_assert( 2022-11-23T03:12:18.4519662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4519785Z traceback.print_stack() 2022-11-23T03:12:18.4520175Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4520630Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4521027Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4521138Z File "", line 1, in 2022-11-23T03:12:18.4521272Z File "", line 1, in 2022-11-23T03:12:18.4521539Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4521688Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4521895Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4522049Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4522256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4522379Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4522587Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4522685Z self.run() 2022-11-23T03:12:18.4522881Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4523030Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4523234Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4523438Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4523652Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4523737Z self.run() 2022-11-23T03:12:18.4524081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4524210Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:18.4524407Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4524554Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4524914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4525037Z getattr(self, test_name)() 2022-11-23T03:12:18.4525373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4525493Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4525859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4525956Z fn() 2022-11-23T03:12:18.4526312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4526436Z getattr(self, test_name)() 2022-11-23T03:12:18.4526793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4526914Z test(self, **param_kwargs) 2022-11-23T03:12:18.4527266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4527344Z fn() 2022-11-23T03:12:18.4527696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4527822Z return func(*args, **kwargs) 2022-11-23T03:12:18.4528184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4528305Z test(self, **param_kwargs) 2022-11-23T03:12:18.4528552Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4528664Z self.run_subtests( 2022-11-23T03:12:18.4528999Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4529123Z return func(*args, **kwargs) 2022-11-23T03:12:18.4529471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4529629Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4529867Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4529983Z self.run_subtests( 2022-11-23T03:12:18.4530386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4530546Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4530878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4531041Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4531415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4531531Z output = model(*input) 2022-11-23T03:12:18.4531889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4532042Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4532406Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4532618Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4532992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4533093Z output = model(*input) 2022-11-23T03:12:18.4533566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T03:12:18.4533644Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4533966Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4534109Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4534474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4534592Z _lazy_init(state, module) 2022-11-23T03:12:18.4534976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4535133Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4535480Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4535624Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4535980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4536096Z _lazy_init(state, module) 2022-11-23T03:12:18.4536430Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4536554Z return func(*args, **kwargs) 2022-11-23T03:12:18.4536894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4537017Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4537401Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4537500Z p_assert( 2022-11-23T03:12:18.4537834Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4537961Z return func(*args, **kwargs) 2022-11-23T03:12:18.4538290Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4538411Z traceback.print_stack() 2022-11-23T03:12:18.4538783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4538869Z p_assert( 2022-11-23T03:12:18.4539199Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4539322Z traceback.print_stack() 2022-11-23T03:12:18.4539554Z File "", line 1, in 2022-11-23T03:12:18.4539716Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4539867Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4540066Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4540197Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4540408Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4540511Z self.run() 2022-11-23T03:12:18.4540711Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4540861Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4541204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4541337Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4541699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4541922Z getattr(self, test_name)() 2022-11-23T03:12:18.4542293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4542389Z fn() 2022-11-23T03:12:18.4542751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.4542870Z test(self, **param_kwargs) 2022-11-23T03:12:18.4543218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4543342Z return func(*args, **kwargs) 2022-11-23T03:12:18.4543666Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4543708Z self.run_subtests( 2022-11-23T03:12:18.4544267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4544448Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4544765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4544918Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4545289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4545405Z output = model(*input) 2022-11-23T03:12:18.4545720Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4545843Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4546215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4546388Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4546761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4546882Z _lazy_init(state, module) 2022-11-23T03:12:18.4547254Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4547373Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4547708Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4547815Z return func(*args, **kwargs) 2022-11-23T03:12:18.4548197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4548297Z p_assert( 2022-11-23T03:12:18.4548626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4548749Z traceback.print_stack() 2022-11-23T03:12:18.4549072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.4549329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.4549569Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.4549788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.4550186Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4550577Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4550966Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4551352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 
2022-11-23T03:12:18.4551556Z File "", line 1, in 2022-11-23T03:12:18.4551771Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4551914Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4552117Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4552250Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4552461Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4552567Z self.run() 2022-11-23T03:12:18.4552767Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4552917Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4553260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4553399Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4553742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4553870Z getattr(self, test_name)() 2022-11-23T03:12:18.4554223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4554322Z fn() 2022-11-23T03:12:18.4554680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4554804Z test(self, **param_kwargs) 2022-11-23T03:12:18.4555154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4555276Z return func(*args, **kwargs) 2022-11-23T03:12:18.4555502Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4555615Z self.run_subtests( 2022-11-23T03:12:18.4555969Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4556133Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4556488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4556641Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4557008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4557126Z output = model(*input) 2022-11-23T03:12:18.4557430Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4557570Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4557944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4558166Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4558541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4558660Z _lazy_init(state, module) 2022-11-23T03:12:18.4559008Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4559149Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4559465Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4559586Z return func(*args, **kwargs) 2022-11-23T03:12:18.4559959Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4560065Z p_assert( 2022-11-23T03:12:18.4560398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4560577Z traceback.print_stack() 2022-11-23T03:12:18.4560706Z File "", line 1, in 2022-11-23T03:12:18.4560914Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4561037Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4561241Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4561394Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4561610Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4561718Z self.run() 2022-11-23T03:12:18.4561922Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4562066Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4562178Z File "", line 1, in 2022-11-23T03:12:18.4562522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4562655Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4562862Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4563008Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4563366Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4563484Z getattr(self, test_name)() 2022-11-23T03:12:18.4563676Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4563807Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4564163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4564261Z fn() 2022-11-23T03:12:18.4564470Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4564574Z self.run() 2022-11-23T03:12:18.4564940Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4565064Z test(self, **param_kwargs) 2022-11-23T03:12:18.4565267Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4565393Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4565754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4565877Z return func(*args, **kwargs) 2022-11-23T03:12:18.4566214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4566352Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4566600Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4566713Z self.run_subtests( 2022-11-23T03:12:18.4567126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4567276Z getattr(self, test_name)() 2022-11-23T03:12:18.4567585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4567745Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4568105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4568281Z fn() 2022-11-23T03:12:18.4568560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4568712Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4569073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4569244Z test(self, **param_kwargs) 2022-11-23T03:12:18.4569673Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4569796Z output = model(*input) 2022-11-23T03:12:18.4570150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4570273Z return func(*args, **kwargs) 2022-11-23T03:12:18.4570597Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4570810Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4570988Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4571083Z self.run_subtests( 2022-11-23T03:12:18.4571458Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4571642Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4571994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4572155Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4572523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4572643Z _lazy_init(state, module) 2022-11-23T03:12:18.4573002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4573136Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4573481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4573626Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4573998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T03:12:18.4574124Z output = model(*input) 2022-11-23T03:12:18.4574458Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4574584Z return func(*args, **kwargs) 2022-11-23T03:12:18.4574904Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4575119Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4575409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4575513Z p_assert( 2022-11-23T03:12:18.4575880Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4576048Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4576383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4576554Z traceback.print_stack() 2022-11-23T03:12:18.4577023Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4577031Z _lazy_init(state, module) 2022-11-23T03:12:18.4577383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4577529Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4577864Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4577989Z return func(*args, **kwargs) 2022-11-23T03:12:18.4578358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4578458Z p_assert( 2022-11-23T03:12:18.4578791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.4578950Z traceback.print_stack() 2022-11-23T03:12:18.4579079Z File "", line 1, in 2022-11-23T03:12:18.4579285Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4579430Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4579630Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4579780Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4579992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4580078Z self.run() 2022-11-23T03:12:18.4580279Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4580422Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4580765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4580908Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4581304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4581504Z getattr(self, test_name)() 2022-11-23T03:12:18.4581790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4581870Z fn() 2022-11-23T03:12:18.4582230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4582351Z test(self, **param_kwargs) 2022-11-23T03:12:18.4582700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4582825Z return func(*args, **kwargs) 2022-11-23T03:12:18.4583147Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4583185Z self.run_subtests( 
2022-11-23T03:12:18.4583537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4583681Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4584375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4584546Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4584911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4585040Z output = model(*input) 2022-11-23T03:12:18.4585350Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4585493Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4585897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4586131Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4586411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4586531Z _lazy_init(state, module) 2022-11-23T03:12:18.4586876Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4587021Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4587354Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4587569Z return func(*args, **kwargs) 2022-11-23T03:12:18.4587857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4587941Z p_assert( 2022-11-23T03:12:18.4588348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.4588477Z traceback.print_stack() 2022-11-23T03:12:18.4588726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.4588963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.4589195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.4589421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.4589817Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4590193Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4590597Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4590990Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.4591123Z File "", line 1, in 2022-11-23T03:12:18.4591335Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4591479Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4591684Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4591833Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4592047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4592134Z self.run() 2022-11-23T03:12:18.4592341Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4592485Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4592833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4592969Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4593330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4593459Z getattr(self, test_name)() 2022-11-23T03:12:18.4593795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4593895Z fn() 2022-11-23T03:12:18.4594255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4594377Z test(self, **param_kwargs) 2022-11-23T03:12:18.4594729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4594848Z return func(*args, **kwargs) 2022-11-23T03:12:18.4595142Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4595260Z self.run_subtests( 2022-11-23T03:12:18.4595595Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4595757Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4596119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4596273Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4596648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4596767Z output = model(*input) 2022-11-23T03:12:18.4597087Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4597275Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4597637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4597812Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4598175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4598300Z _lazy_init(state, module) 2022-11-23T03:12:18.4598651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4598795Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4599126Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4599255Z return func(*args, **kwargs) 2022-11-23T03:12:18.4599610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4599717Z p_assert( 2022-11-23T03:12:18.4600055Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4600184Z traceback.print_stack() 2022-11-23T03:12:18.4600314Z File "", line 1, in 2022-11-23T03:12:18.4600523Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4600665Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4600866Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4600998Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4601208Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4601318Z self.run() 2022-11-23T03:12:18.4601519Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4601669Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4602007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4602143Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4602507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4602612Z getattr(self, test_name)() 2022-11-23T03:12:18.4602968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4603066Z fn() 2022-11-23T03:12:18.4603431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4603655Z test(self, **param_kwargs) 2022-11-23T03:12:18.4603910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4604039Z return func(*args, **kwargs) 2022-11-23T03:12:18.4604313Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4604435Z self.run_subtests( 2022-11-23T03:12:18.4604790Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4604954Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4605317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4605474Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4605842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4605958Z output = model(*input) 2022-11-23T03:12:18.4606319Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4606450Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4606829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4607007Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4607378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4607504Z _lazy_init(state, module) 2022-11-23T03:12:18.4607849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4607991Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4608319Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4608425Z return func(*args, **kwargs) 2022-11-23T03:12:18.4608802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4608911Z p_assert( 2022-11-23T03:12:18.4609251Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4609375Z traceback.print_stack() 2022-11-23T03:12:18.4609500Z File "", line 1, in 2022-11-23T03:12:18.4609703Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4609826Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4610019Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4610165Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4610381Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4610485Z self.run() 2022-11-23T03:12:18.4610690Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4610840Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4611179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4611295Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4611690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4611783Z getattr(self, test_name)() 2022-11-23T03:12:18.4612143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4612242Z fn() 2022-11-23T03:12:18.4612608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4612731Z test(self, **param_kwargs) 2022-11-23T03:12:18.4613086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4613197Z return func(*args, **kwargs) 2022-11-23T03:12:18.4613492Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4613614Z self.run_subtests( 2022-11-23T03:12:18.4613966Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4614129Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4614492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4614647Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4615020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4615121Z output = model(*input) 2022-11-23T03:12:18.4615441Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4615633Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4616011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4616188Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4616548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4616670Z _lazy_init(state, module) 2022-11-23T03:12:18.4617020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4617146Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4617483Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4617611Z return func(*args, **kwargs) 2022-11-23T03:12:18.4617986Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4618092Z p_assert( 2022-11-23T03:12:18.4618430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4618555Z traceback.print_stack() 2022-11-23T03:12:18.4618665Z File "", line 1, in 2022-11-23T03:12:18.4618874Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4619017Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4619218Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4619367Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4619569Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4619673Z self.run() 2022-11-23T03:12:18.4619897Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4620026Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4620364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4620498Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4620838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4620961Z getattr(self, test_name)() 2022-11-23T03:12:18.4621318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4621414Z fn() 2022-11-23T03:12:18.4621772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4621893Z test(self, **param_kwargs) 2022-11-23T03:12:18.4622244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4622373Z return func(*args, **kwargs) 2022-11-23T03:12:18.4622648Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4622769Z self.run_subtests( 2022-11-23T03:12:18.4623129Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4623293Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4623652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4623804Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4624491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4624595Z output = model(*input) 2022-11-23T03:12:18.4624913Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4625146Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4625514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4625706Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4626049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4626187Z _lazy_init(state, module) 2022-11-23T03:12:18.4626554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4626598Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4626931Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4627037Z return func(*args, **kwargs) 2022-11-23T03:12:18.4627417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4627520Z p_assert( 2022-11-23T03:12:18.4627905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4628026Z traceback.print_stack() 2022-11-23T03:12:18.4628263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.4628498Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.4628726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.4628937Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.4629332Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4629470Z File "", line 1, in 2022-11-23T03:12:18.4629683Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4629821Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4630020Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4630171Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4630386Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4630472Z self.run() 2022-11-23T03:12:18.4630674Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4630818Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4631155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4631288Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4631641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4631832Z getattr(self, test_name)() 2022-11-23T03:12:18.4632185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:18.4632287Z fn() 2022-11-23T03:12:18.4632651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4632805Z test(self, **param_kwargs) 2022-11-23T03:12:18.4633223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4633257Z return func(*args, **kwargs) 2022-11-23T03:12:18.4633582Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4633613Z self.run_subtests( 2022-11-23T03:12:18.4633943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4634174Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4634537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4634692Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4635065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4635186Z output = model(*input) 2022-11-23T03:12:18.4635513Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4635654Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4636008Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4636183Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4636555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4636678Z _lazy_init(state, module) 2022-11-23T03:12:18.4637029Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4637172Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4637502Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4637626Z return func(*args, **kwargs) 2022-11-23T03:12:18.4637984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4638088Z p_assert( 2022-11-23T03:12:18.4638420Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4638546Z traceback.print_stack() 2022-11-23T03:12:18.4638951Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4639352Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4639746Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 
2022-11-23T03:12:18.4639877Z File "", line 1, in 2022-11-23T03:12:18.4640090Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4640214Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4640415Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4640565Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4640774Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4640884Z self.run() 2022-11-23T03:12:18.4641135Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4641288Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4641612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4641748Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4642166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4642290Z getattr(self, test_name)() 2022-11-23T03:12:18.4642645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4642744Z fn() 2022-11-23T03:12:18.4643103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4643226Z test(self, **param_kwargs) 2022-11-23T03:12:18.4643619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4643746Z return func(*args, **kwargs) 2022-11-23T03:12:18.4644027Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4644110Z self.run_subtests( 2022-11-23T03:12:18.4644462Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4644623Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4644983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4645134Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4645485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4645612Z output = model(*input) 2022-11-23T03:12:18.4645938Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4646082Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4646457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4646630Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4646758Z File "", line 1, in 2022-11-23T03:12:18.4647179Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4647226Z _lazy_init(state, module) 2022-11-23T03:12:18.4647604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4647717Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4647931Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4648072Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4648404Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4648606Z return func(*args, **kwargs) 2022-11-23T03:12:18.4648727Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:12:18.4648859Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4649235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4649340Z p_assert( 2022-11-23T03:12:18.4649552Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4649651Z self.run() 2022-11-23T03:12:18.4649987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4650115Z traceback.print_stack() 2022-11-23T03:12:18.4650347Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4650501Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4650840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4650972Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4651325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4651449Z getattr(self, test_name)() 2022-11-23T03:12:18.4651804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4651896Z fn() 2022-11-23T03:12:18.4652242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4652363Z test(self, **param_kwargs) 2022-11-23T03:12:18.4652812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4652937Z return func(*args, **kwargs) 2022-11-23T03:12:18.4653187Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4653303Z self.run_subtests( 2022-11-23T03:12:18.4653655Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4653818Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4654159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4654316Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4654687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4654809Z output = model(*input) 2022-11-23T03:12:18.4655134Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4655271Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4655645Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4655823Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4656170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4656287Z _lazy_init(state, module) 2022-11-23T03:12:18.4656635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4656775Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4657105Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4657235Z return func(*args, **kwargs) 2022-11-23T03:12:18.4657616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4657719Z p_assert( 2022-11-23T03:12:18.4658034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4658156Z traceback.print_stack() 2022-11-23T03:12:18.4658283Z File "", line 1, in 2022-11-23T03:12:18.4658485Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4658627Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4658827Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4658978Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4659188Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4659279Z self.run() 2022-11-23T03:12:18.4659523Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4659677Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4660022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4660150Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4660510Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4660633Z getattr(self, test_name)() 2022-11-23T03:12:18.4660988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4661145Z fn() 2022-11-23T03:12:18.4661429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4661596Z test(self, **param_kwargs) 2022-11-23T03:12:18.4661958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4662083Z return func(*args, **kwargs) 2022-11-23T03:12:18.4662328Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4662439Z self.run_subtests( 2022-11-23T03:12:18.4662769Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4662930Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4663288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4663441Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4663814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4664199Z output = model(*input) 2022-11-23T03:12:18.4664643Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4664737Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4665037Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4665214Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4665579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4665699Z _lazy_init(state, module) 2022-11-23T03:12:18.4666049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4666262Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4666520Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4666652Z return func(*args, **kwargs) 2022-11-23T03:12:18.4667030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4667115Z p_assert( 2022-11-23T03:12:18.4667451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4667668Z traceback.print_stack() 2022-11-23T03:12:18.4667830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.4668068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.4668297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.4668610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.4668996Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4669118Z File "", line 1, in 2022-11-23T03:12:18.4669323Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4669514Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4669725Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4669877Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4670081Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4670183Z self.run() 2022-11-23T03:12:18.4670365Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4670510Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4670855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4671065Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4671426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4671549Z getattr(self, test_name)() 2022-11-23T03:12:18.4671907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:18.4672007Z fn() 2022-11-23T03:12:18.4672352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4672477Z test(self, **param_kwargs) 2022-11-23T03:12:18.4672833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4672958Z return func(*args, **kwargs) 2022-11-23T03:12:18.4673207Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4673326Z self.run_subtests( 2022-11-23T03:12:18.4673679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4673842Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4674184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4674341Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4674714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4674833Z output = model(*input) 2022-11-23T03:12:18.4675156Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4675298Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4675669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4675853Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4676202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4676322Z _lazy_init(state, module) 2022-11-23T03:12:18.4676667Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4676818Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4677148Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4677295Z return func(*args, **kwargs) 2022-11-23T03:12:18.4677672Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4677759Z p_assert( 2022-11-23T03:12:18.4678119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4678252Z traceback.print_stack() 2022-11-23T03:12:18.4678651Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4679046Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4679441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.4679570Z File "", line 1, in 2022-11-23T03:12:18.4679780Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4679925Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4680109Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4680312Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4680532Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4680639Z self.run() 2022-11-23T03:12:18.4680846Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4681063Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4681335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4681467Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4681888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4681928Z getattr(self, test_name)() 2022-11-23T03:12:18.4682284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4682379Z fn() 2022-11-23T03:12:18.4682513Z File "", line 1, in 2022-11-23T03:12:18.4682877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4683000Z test(self, **param_kwargs) 2022-11-23T03:12:18.4683348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4683486Z return func(*args, **kwargs) 2022-11-23T03:12:18.4683666Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4683806Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4684048Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4684159Z self.run_subtests( 2022-11-23T03:12:18.4684359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4684505Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4684866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4685010Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4685224Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4685331Z self.run() 2022-11-23T03:12:18.4685691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4685843Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4686051Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4686196Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4686551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4686672Z output = model(*input) 2022-11-23T03:12:18.4687074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4687217Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4687547Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4687747Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4688107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4688230Z getattr(self, test_name)() 2022-11-23T03:12:18.4688585Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4688760Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4689116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4689217Z fn() 2022-11-23T03:12:18.4689637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4689760Z _lazy_init(state, module) 2022-11-23T03:12:18.4690123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4690244Z test(self, **param_kwargs) 2022-11-23T03:12:18.4690572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4690717Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4691073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4691199Z return func(*args, **kwargs) 2022-11-23T03:12:18.4691534Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4691659Z return func(*args, **kwargs) 2022-11-23T03:12:18.4691911Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4692026Z self.run_subtests( 2022-11-23T03:12:18.4692382Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4692488Z p_assert( 2022-11-23T03:12:18.4692836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4692996Z test_fn(*test_args, **test_kwargs, 
**subtest_kwargs) 2022-11-23T03:12:18.4693330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4693457Z traceback.print_stack() 2022-11-23T03:12:18.4693819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4693967Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4694327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4694445Z output = model(*input) 2022-11-23T03:12:18.4694766Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4694905Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4695274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4695452Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4695815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4695938Z _lazy_init(state, module) 2022-11-23T03:12:18.4696268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4696409Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4696791Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4696922Z return func(*args, **kwargs) 2022-11-23T03:12:18.4697299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4697405Z p_assert( 2022-11-23T03:12:18.4697741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4697866Z traceback.print_stack() 2022-11-23T03:12:18.4697978Z File "", line 1, in 2022-11-23T03:12:18.4698182Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4698322Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4698523Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4698738Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4698951Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4699059Z self.run() 2022-11-23T03:12:18.4699262Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4699389Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4699732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4699865Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4700223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4700411Z getattr(self, test_name)() 2022-11-23T03:12:18.4700709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4700812Z fn() 2022-11-23T03:12:18.4701157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4701282Z test(self, **param_kwargs) 2022-11-23T03:12:18.4701639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4701765Z return func(*args, **kwargs) 2022-11-23T03:12:18.4702014Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4702121Z self.run_subtests( 2022-11-23T03:12:18.4702469Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4702630Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4702967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4703126Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4703575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4703615Z output = model(*input) 2022-11-23T03:12:18.4704280Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4704416Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4704816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4704989Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4705365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4705466Z _lazy_init(state, module) 2022-11-23T03:12:18.4705804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4705894Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4706267Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4706405Z return func(*args, **kwargs) 2022-11-23T03:12:18.4706787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4706890Z p_assert( 2022-11-23T03:12:18.4707219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4707328Z traceback.print_stack() 2022-11-23T03:12:18.4707578Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.4707815Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.4708046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.4708341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.4708740Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4708874Z File "", line 1, in 2022-11-23T03:12:18.4709087Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4709211Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4709409Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4709561Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4709772Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4709876Z self.run() 2022-11-23T03:12:18.4710078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4710227Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4710550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4710687Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4711048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4711174Z getattr(self, test_name)() 2022-11-23T03:12:18.4711532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:18.4711630Z fn() 2022-11-23T03:12:18.4712039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4712113Z test(self, **param_kwargs) 2022-11-23T03:12:18.4712445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4712578Z return func(*args, **kwargs) 2022-11-23T03:12:18.4712828Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4712942Z self.run_subtests( 2022-11-23T03:12:18.4713293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4713457Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4713815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4714020Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4714320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4714465Z output = model(*input) 2022-11-23T03:12:18.4714765Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4714958Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4715347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4715523Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4715887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4716008Z _lazy_init(state, module) 2022-11-23T03:12:18.4716339Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4716484Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4716818Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4716941Z return func(*args, **kwargs) 2022-11-23T03:12:18.4717378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4717480Z p_assert( 2022-11-23T03:12:18.4717809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4717936Z traceback.print_stack() 2022-11-23T03:12:18.4718315Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4718715Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4719109Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 
2022-11-23T03:12:18.4719239Z File "", line 1, in 2022-11-23T03:12:18.4719450Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4719591Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4719794Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4719942Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4720137Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4720236Z self.run() 2022-11-23T03:12:18.4720448Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4720581Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4720917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4721049Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4721405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4721527Z getattr(self, test_name)() 2022-11-23T03:12:18.4721865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4721964Z fn() 2022-11-23T03:12:18.4722331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4722454Z test(self, **param_kwargs) 2022-11-23T03:12:18.4722807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4722933Z return func(*args, **kwargs) 2022-11-23T03:12:18.4723174Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4723289Z self.run_subtests( 2022-11-23T03:12:18.4723621Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4723779Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4724192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4724349Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4724479Z File "", line 1, in 2022-11-23T03:12:18.4724855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4724975Z output = model(*input) 2022-11-23T03:12:18.4725296Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4725418Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4725631Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4725775Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4726149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4726375Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4726583Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4726733Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4727099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4727203Z _lazy_init(state, module) 2022-11-23T03:12:18.4727416Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4727521Z self.run() 2022-11-23T03:12:18.4727866Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4728007Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.4728207Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4728349Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4728687Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4728794Z return func(*args, **kwargs) 2022-11-23T03:12:18.4729126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4729260Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4729636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4729738Z p_assert( 2022-11-23T03:12:18.4730095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4730218Z getattr(self, test_name)() 2022-11-23T03:12:18.4730535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.4730662Z traceback.print_stack() 2022-11-23T03:12:18.4731027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4731126Z fn() 2022-11-23T03:12:18.4731489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4731613Z test(self, **param_kwargs) 2022-11-23T03:12:18.4731964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4732087Z return func(*args, **kwargs) 2022-11-23T03:12:18.4732313Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4732483Z self.run_subtests( 2022-11-23T03:12:18.4732780Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4732939Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4733349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4733607Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4733942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4734010Z output = model(*input) 2022-11-23T03:12:18.4734313Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4734454Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4734831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4735011Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4735374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4735540Z _lazy_init(state, module) 2022-11-23T03:12:18.4735898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4736042Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4736359Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4736485Z return func(*args, **kwargs) 2022-11-23T03:12:18.4736861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4736967Z p_assert( 2022-11-23T03:12:18.4737303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4737429Z traceback.print_stack() 2022-11-23T03:12:18.4737555Z File "", line 1, in 2022-11-23T03:12:18.4737760Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.4737888Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.4738092Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.4738248Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.4738462Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.4738567Z self.run() 2022-11-23T03:12:18.4738773Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.4738921Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.4739260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.4739374Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.4739738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.4739861Z getattr(self, test_name)() 2022-11-23T03:12:18.4740227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.4740322Z fn() 2022-11-23T03:12:18.4740678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.4740797Z test(self, **param_kwargs) 2022-11-23T03:12:18.4741134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.4741255Z return func(*args, **kwargs) 2022-11-23T03:12:18.4741496Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T03:12:18.4741607Z self.run_subtests( 2022-11-23T03:12:18.4742014Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.4742177Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.4742597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.4742752Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.4743126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.4743228Z output = model(*input) 2022-11-23T03:12:18.4743546Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.4743681Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.4744337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.4744515Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.4744795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.4744993Z _lazy_init(state, module) 2022-11-23T03:12:18.4745349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.4745475Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.4745815Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.4745943Z return func(*args, **kwargs) 2022-11-23T03:12:18.4746320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.4746424Z p_assert( 2022-11-23T03:12:18.4746756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.4746881Z traceback.print_stack() 2022-11-23T03:12:18.4747109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.4747356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.4747593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.4747823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.4748220Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4748617Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4749013Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4749404Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4749637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.4749877Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.4750088Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.4750477Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 
2022-11-23T03:12:18.4750712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.4751099Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4751490Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4751874Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4752176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.4752496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.4752653Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.4753025Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4753266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.4753652Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4754039Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4754477Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.4754712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.4754943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.4755171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.4755560Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4755778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.4756162Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4756553Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4756940Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4757181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.4757408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.4757637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.4758022Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4758258Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.4758639Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.4759015Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4759409Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4760162Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4760905Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4761698Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4761948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.4762182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.4762411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.4762639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.4763036Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.4763495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4763883Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4764254Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4764491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.4764722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.4764949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.4765177Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.4765574Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4765968Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4766356Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4766785Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 
2022-11-23T03:12:18.4766961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.4767193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.4767416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.4767808Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4768051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.4768441Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4768827Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4769214Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4769448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.4769725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.4769943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.4770382Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4770626Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.4771018Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.4771404Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4771789Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4772025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.4772257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.4772539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.4773006Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4773150Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.4773531Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4773917Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4774304Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4774542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.4774784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.4775014Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.4775402Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 
2022-11-23T03:12:18.4775619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.4775999Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4776383Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4776768Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4777521Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4778261Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4779002Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4779300Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.4779536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.4779766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.4779996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.4780390Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4780762Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4781154Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4781600Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4781837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 2 2022-11-23T03:12:18.4782087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 3 2022-11-23T03:12:18.4782415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T03:12:18.4782704Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.4782947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T03:12:18.4783337Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 
2022-11-23T03:12:18.4783803Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.4784445Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.4784685Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 3 2022-11-23T03:12:18.4784921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T03:12:18.4785155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 2 2022-11-23T03:12:18.4785527Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.4785770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T03:12:18.4786071Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.4786518Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.4786849Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.4787063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 2 2022-11-23T03:12:18.4787299Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 3 2022-11-23T03:12:18.4787529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T03:12:18.4787913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 
2022-11-23T03:12:18.4788202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T03:12:18.4788670Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.4789071Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.4789454Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.4789688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 3 2022-11-23T03:12:18.4789922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 2 2022-11-23T03:12:18.4790135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T03:12:18.4790522Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.4790825Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T03:12:18.4791211Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.4791596Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.4791979Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 
2022-11-23T03:12:18.4792215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 2 2022-11-23T03:12:18.4792444Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 3 2022-11-23T03:12:18.4792672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T03:12:18.4793048Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.4793287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T03:12:18.4793672Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.4794059Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.4794443Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.4795188Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4795942Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4796680Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4796922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 2 2022-11-23T03:12:18.4797156Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 3 2022-11-23T03:12:18.4797386Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T03:12:18.4797699Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T03:12:18.4798084Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.4798479Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.4798868Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.4799257Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.4799491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 2 2022-11-23T03:12:18.4799721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 3 2022-11-23T03:12:18.4800000Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T03:12:18.4800390Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 
2022-11-23T03:12:18.4800633Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T03:12:18.4801004Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.4801470Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.4801784Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.4802016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T03:12:18.4802259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T03:12:18.4802487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 2 2022-11-23T03:12:18.4802877Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.4803119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 3 2022-11-23T03:12:18.4803509Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.4803896Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.4804362Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.4805012Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4805752Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4806489Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4806779Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 2 2022-11-23T03:12:18.4807057Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 3 2022-11-23T03:12:18.4807251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T03:12:18.4807477Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T03:12:18.4807867Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.4808250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.4808631Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.4809069Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 
2022-11-23T03:12:18.4809287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 2 2022-11-23T03:12:18.4809515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 3 2022-11-23T03:12:18.4809740Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T03:12:18.4810125Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.4810359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T03:12:18.4810738Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.4811122Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.4811512Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.4811745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 2 2022-11-23T03:12:18.4811956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 3 2022-11-23T03:12:18.4812181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T03:12:18.4812567Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.4812802Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T03:12:18.4813185Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 
2022-11-23T03:12:18.4813573Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.4813957Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.4814702Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4815441Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4816230Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4816477Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 2 2022-11-23T03:12:18.4816711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 3 2022-11-23T03:12:18.4816921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T03:12:18.4817318Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 
2022-11-23T03:12:18.4817560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T03:12:18.4818008Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.4818395Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.4818781Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.4818892Z dist init r=1, world=4 2022-11-23T03:12:18.4819216Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4819530Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4819833Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4820128Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4820448Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4820904Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4821110Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.4821407Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4821710Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4822009Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4822307Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4822600Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.4822706Z dist init r=0, world=4 2022-11-23T03:12:18.4823027Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4823384Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4823691Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4824175Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4824522Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.4824785Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4825087Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4825483Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4825779Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4826077Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4826371Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4826669Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.4826785Z dist init r=2, world=4 2022-11-23T03:12:18.4827088Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4827402Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4827709Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.4828012Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4828313Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4828618Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4828917Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4829213Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4829513Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4829813Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4830177Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4830491Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.4830583Z dist init r=3, world=4 2022-11-23T03:12:18.4830904Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.4831212Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4831516Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4831867Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4832165Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4832464Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4832760Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4833055Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4833351Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4833654Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4833991Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.4834254Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.4834333Z ok (8.228s) 2022-11-23T03:12:18.4834689Z test_mixture_of_experts_with_delay_before_free_offload_false_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17730 2022-11-23T03:12:18.4834906Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17731 2022-11-23T03:12:18.4835127Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17732 2022-11-23T03:12:18.4835337Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17733 2022-11-23T03:12:18.4835725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4835902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4836267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4836454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4836823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4836997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4837371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4837604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4837978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4838150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:12:18.4838524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4838697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4839059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4839231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4839604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4839843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4840086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.4840326Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.4840563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.4840783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.4841179Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4841566Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4842004Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4842396Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.4842625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.4842846Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.4843067Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.4843285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.4844324Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4844418Z warnings.warn( 2022-11-23T03:12:18.4845427Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4845520Z warnings.warn( 2022-11-23T03:12:18.4846575Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4846696Z warnings.warn( 2022-11-23T03:12:18.4846939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.4848029Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4848156Z warnings.warn( 2022-11-23T03:12:18.4848451Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.4848658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.4848896Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.4849292Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4849686Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4850071Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4850439Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.4850679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.4850920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.4851156Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.4851546Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4851776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.4852158Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4852541Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4852921Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4853148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.4853377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.4853616Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.4854008Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4854398Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.4854641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.4855022Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4855453Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4856219Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4856959Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4857709Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4858517Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4858760Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.4858982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.4859221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.4859609Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4859843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.4860240Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4860628Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4861012Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4861254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.4861491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.4861709Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.4862103Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4862342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.4862726Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.4863118Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4863501Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4863742Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.4864326Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.4864568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.4865036Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4865270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.4865646Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4865948Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4866329Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4867098Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4867882Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4868621Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4869361Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4870156Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4870890Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4871624Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4872359Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4873087Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4873808Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4874594Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4875327Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4875576Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.4875819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.4876110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.4876512Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4876729Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.4877124Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4877513Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4877893Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4878133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.4878368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.4878610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.4879002Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4879392Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 
2022-11-23T03:12:18.4879613Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.4880002Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4880386Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4880628Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.4880993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.4881228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.4881622Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4882012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4882250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.4882644Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4883015Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4883811Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4884565Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4885308Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4886116Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4886363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.4886597Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.4886827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.4887222Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4887462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.4887866Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 
2022-11-23T03:12:18.4888257Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4888707Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4888862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.4889092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.4889316Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.4889705Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4889951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.4890340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4890730Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4891118Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4891357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.4891579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.4891814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.4892251Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 
2022-11-23T03:12:18.4892500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.4892893Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4893281Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4893664Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4893896Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.4894130Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.4894340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.4894775Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4895016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.4895405Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4895795Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4896183Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4896929Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4897668Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4898407Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4899144Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4899880Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4900609Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4901390Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4902137Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4902863Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4903594Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4904721Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4905435Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4905698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.4905926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.4906085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.4906464Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4906858Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4907101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.4907496Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4907883Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4908121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.4908358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.4908584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.4908981Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4909350Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.4909589Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.4909979Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4910367Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4910674Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.4910916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.4911146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.4911374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.4911767Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4912132Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4912518Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.4912977Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 
2022-11-23T03:12:18.4913210Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.4913437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.4913664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.4914051Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4914295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.4914685Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4915076Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4915452Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.4915688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.4915920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.4916151Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.4916541Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4916773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.4917160Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.4917551Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4917937Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.4918154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.4918388Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.4918616Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.4919011Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4919251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.4919686Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4920080Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4920463Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.4920701Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.4920913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.4921143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.4921532Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 
2022-11-23T03:12:18.4921818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.4922209Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4922593Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4922979Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.4923213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.4923445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.4923671Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.4924041Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4924284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.4924666Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4925050Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.4925438Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.4925673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.4925905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.4926136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.4926528Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4926745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.4927128Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4927512Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4927896Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.4928132Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.4928361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.4928638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.4929043Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4929276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.4929638Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 
2022-11-23T03:12:18.4930025Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4930410Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.4930643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.4930960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.4931190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.4931581Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4931817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.4932199Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4932587Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4932951Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.4933148Z dist init r=3, world=4 2022-11-23T03:12:18.4933185Z dist init r=1, world=4 2022-11-23T03:12:18.4933300Z dist init r=0, world=4 2022-11-23T03:12:18.4933411Z dist init r=2, world=4 2022-11-23T03:12:18.4933512Z ok (26.169s) 2022-11-23T03:12:18.4933872Z test_mixture_of_experts_with_delay_before_free_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18511 2022-11-23T03:12:18.4934075Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18512 2022-11-23T03:12:18.4934374Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18513 2022-11-23T03:12:18.4934509Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18514 2022-11-23T03:12:18.4934882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4935058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4935448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4935643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4936010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4936187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4936544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4936736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4937103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.4937282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4937659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4937904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4938285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.4938460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.4938814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.4939004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.4939249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.4939491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.4939731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.4940021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.4940419Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4940811Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4941197Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.4941578Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.4941788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.4942071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.4942306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.4942534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.4943549Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4943665Z warnings.warn( 2022-11-23T03:12:18.4944938Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4945053Z warnings.warn( 2022-11-23T03:12:18.4945991Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4946105Z warnings.warn( 2022-11-23T03:12:18.4946347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.4946595Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.4946893Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.4947919Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.4948037Z warnings.warn( 2022-11-23T03:12:18.4948274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.4948669Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4949147Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4949532Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.4949917Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.4950161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.4950396Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.4950615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.4951006Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4951246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.4951637Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4952026Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4952408Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.4952647Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.4952880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.4953112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.4953495Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4953867Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.4954108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.4954497Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4954887Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.4955638Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4956431Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4957184Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4957921Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4958211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.4958455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.4958694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.4959088Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4959482Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4959704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.4960098Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4960486Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.4960734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.4960971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.4961208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.4961618Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4961837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.4962230Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.4962598Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4962992Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.4963233Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.4963467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.4963702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.4964087Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4964391Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.4964706Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4965146Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4965539Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.4966267Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4967008Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4967753Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4968544Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4968788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.4969028Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.4969268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.4969844Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4970162Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4970403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.4970796Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.4971185Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.4971428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.4971646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.4971883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.4972280Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4972514Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.4972905Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4973292Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4973678Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.4973921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.4974153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.4974420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.4974827Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4975224Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.4975469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.4975864Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4976254Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.4977002Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4977791Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4978578Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4979373Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4979620Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.4979860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.4980090Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.4980301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.4980700Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4981096Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4981499Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4981891Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.4982202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.4982364Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.4982595Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.4983067Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4983251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.4983649Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.4984330Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4984698Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.4984982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.4985245Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.4985445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.4985889Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4986209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.4986608Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4986956Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4987271Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.4988016Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4988260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.4988504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.4988739Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.4989133Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4989524Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4989767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.4990158Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4990529Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.4991279Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4992028Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4992765Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.4993071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.4993317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.4993547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.4993779Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.4994179Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4994570Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4994960Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4995405Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.4996151Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4996374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.4996610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.4996844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.4997234Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4997480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.4997874Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4998266Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4998661Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.4999401Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.4999648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.4999874Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.5000108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.5000507Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5000897Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5001134Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.5001521Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5001907Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5002261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.5002473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.5002705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.5003145Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5003437Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.5003772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.5004172Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5004461Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5005257Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5005501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.5005737Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.5005969Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.5006360Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5006582Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.5006979Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5007370Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5007761Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.5007996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.5008230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.5008462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.5008851Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5009248Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5009469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.5009856Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5010246Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5010989Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5011775Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5012027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.5012262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.5012492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.5012721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.5013198Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5013580Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5013951Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5014338Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5014572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.5014811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.5015041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.5015278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.5015657Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5016058Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.5016448Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5016838Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5017054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.5017287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.5017516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.5017744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.5018141Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5018529Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5018915Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5019298Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5019532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.5019743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.5019973Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.5020412Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 
2022-11-23T03:12:18.5020665Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.5021052Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5021436Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5021822Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5022572Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5022864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.5023104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.5023317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.5023708Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5024430Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5024703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.5025098Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 
2022-11-23T03:12:18.5025482Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5025586Z dist init r=0, world=4 2022-11-23T03:12:18.5025687Z dist init r=3, world=4 2022-11-23T03:12:18.5025831Z dist init r=2, world=4 2022-11-23T03:12:18.5025916Z dist init r=1, world=4 2022-11-23T03:12:18.5025997Z ok (30.076s) 2022-11-23T03:12:18.5026394Z test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19292 2022-11-23T03:12:18.5026610Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19293 2022-11-23T03:12:18.5026817Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19294 2022-11-23T03:12:18.5027020Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19295 2022-11-23T03:12:18.5027348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5027528Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5027890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5028082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5028453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5028631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5029008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5029196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5029560Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5029811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5030183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5030373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5030741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5030918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5031293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5031485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5031730Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.5032043Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.5032287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.5032508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.5032909Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5033303Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5033690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.5034075Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5034312Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.5034543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.5034770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.5034994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.5036014Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5036130Z warnings.warn( 2022-11-23T03:12:18.5036361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.5037364Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.5037482Z warnings.warn( 2022-11-23T03:12:18.5037721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.5038772Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5038903Z warnings.warn( 2022-11-23T03:12:18.5039918Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5040026Z warnings.warn( 2022-11-23T03:12:18.5040271Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.5040511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.5040963Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5041358Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5041728Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.5042169Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5042415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.5042653Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.5042891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.5043290Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5043683Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5043919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.5044310Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5044699Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5044919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.5045157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.5045404Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.5045795Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.5046079Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.5046421Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5046809Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5047192Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5047991Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5048748Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5049491Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5050235Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5050526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.5050746Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.5050983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.5051377Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5051850Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5052013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.5052410Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5052801Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5053041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.5053279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.5053496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.5053888Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5054124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.5054512Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.5054909Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5055292Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5055532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.5055765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.5056003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.5056371Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5056607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.5057041Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5057440Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5057820Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5058564Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5059302Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5060089Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5060823Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5061065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.5061305Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.5061549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.5061938Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5062155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.5062546Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5062941Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.5063328Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5063567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.5063807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.5064508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.5065020Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.5065433Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.5065674Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.5066052Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.5066434Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.5066753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.5066931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.5067246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.5067562Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.5067807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.5068205Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.5068598Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.5069050Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.5069831Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5070582Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5071328Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5072073Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5072319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.5072559Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.5072790Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.5073272Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.5073517Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.5073912Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.5074303Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.5074693Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.5074910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.5075143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.5075374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.5075768Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.5076099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.5076503Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.5076988Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.5077382Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.5077619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.5077837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.5078023Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.5078503Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.5078924Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.5079085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.5079465Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.5079848Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.5080596Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5080844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.5081080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.5081313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.5081685Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.5081926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.5082317Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.5082707Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.5083101Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.5083877Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5084587Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5085375Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5085629Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.5085867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.5086101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.5086497Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.5086718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.5087112Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.5087555Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.5087948Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.5088750Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5088995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.5089229Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.5089551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.5089874Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.5090112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.5090484Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.5090878Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.5091267Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.5092004Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5092252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.5092483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.5092714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.5092940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.5093332Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5093721Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5094091Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5094535Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5094778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.5095012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.5095243Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.5095632Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5095871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.5096260Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.5096706Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5097095Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5097816Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5098061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.5098315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.5098531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.5098930Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5099320Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5099559Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.5099943Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5100331Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.5100570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.5100783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.5101020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.5101409Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5101795Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5102041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.5102428Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5102820Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5103605Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5104672Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5104967Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.5105188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.5105412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.5105818Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5106145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.5106527Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5106938Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5107304Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5107474Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.5107708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.5107938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.5108314Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5108704Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.5108944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.5109330Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5109716Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5109954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.5110187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.5110421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.5110811Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5111049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.5111415Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5111804Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5112191Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 
2022-11-23T03:12:18.5112428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.5112731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.5112972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.5113363Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5113619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.5113991Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5114358Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5114744Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5115547Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5115792Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.5116030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.5116264Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.5116652Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5116890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.5117284Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5117677Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5118066Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5118163Z dist init r=0, world=4 2022-11-23T03:12:18.5118273Z dist init r=3, world=4 2022-11-23T03:12:18.5118385Z dist init r=2, world=4 2022-11-23T03:12:18.5118494Z dist init r=1, world=4 2022-11-23T03:12:18.5118596Z ok (26.469s) 2022-11-23T03:12:18.5118960Z test_mixture_of_experts_with_delay_before_free_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20073 2022-11-23T03:12:18.5119183Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20074 2022-11-23T03:12:18.5119385Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20075 2022-11-23T03:12:18.5119603Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20076 2022-11-23T03:12:18.5119980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5120162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5120585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5120795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5121166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5121342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5121699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5121949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5122404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5122509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5122883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5123076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5123442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.5123615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5123987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5124209Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5124454Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.5124698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.5124940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.5125184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.5125585Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5125983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5126367Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5126758Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.5126969Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.5127201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.5127427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.5127662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.5128682Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5128806Z warnings.warn( 2022-11-23T03:12:18.5129821Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5129935Z warnings.warn( 2022-11-23T03:12:18.5130994Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5131118Z warnings.warn( 2022-11-23T03:12:18.5131368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.5131610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.5132615Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5132773Z warnings.warn( 2022-11-23T03:12:18.5132993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.5133234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.5133630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.5133764Z File "", line 1, in 2022-11-23T03:12:18.5133986Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5134131Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5134332Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5134485Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5134680Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5134792Z self.run() 2022-11-23T03:12:18.5135000Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5135222Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5135564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5135701Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5136069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5136175Z getattr(self, test_name)() 2022-11-23T03:12:18.5136540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5136643Z fn() 2022-11-23T03:12:18.5137009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5137135Z test(self, **param_kwargs) 2022-11-23T03:12:18.5137501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5137628Z return func(*args, **kwargs) 2022-11-23T03:12:18.5137910Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5138005Z self.run_subtests( 2022-11-23T03:12:18.5138359Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5138525Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5138895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5139054Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5139431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5139556Z output = model(*input) 2022-11-23T03:12:18.5139934Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5140061Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5140438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5140617Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5140986Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5141111Z _lazy_init(state, module) 2022-11-23T03:12:18.5141462Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5141608Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5141945Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5142201Z return func(*args, **kwargs) 2022-11-23T03:12:18.5142569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5142680Z p_assert( 2022-11-23T03:12:18.5143016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5143146Z traceback.print_stack() 2022-11-23T03:12:18.5143544Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5144263Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5144770Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5144845Z File "", line 1, in 2022-11-23T03:12:18.5145001Z File "", line 1, in 2022-11-23T03:12:18.5145135Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5145285Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5145494Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5145737Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5145864Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5146109Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5146204Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5146313Z self.run() 2022-11-23T03:12:18.5146517Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5146668Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5146877Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5147030Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5147244Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5147351Z self.run() 2022-11-23T03:12:18.5147676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5147809Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:18.5148012Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5148161Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5148526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5148651Z getattr(self, test_name)() 2022-11-23T03:12:18.5148990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5149129Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5149545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5149659Z fn() 2022-11-23T03:12:18.5150027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5150154Z getattr(self, test_name)() 2022-11-23T03:12:18.5150518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5150640Z test(self, **param_kwargs) 2022-11-23T03:12:18.5150995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5151094Z fn() 2022-11-23T03:12:18.5151432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5151559Z return func(*args, **kwargs) 2022-11-23T03:12:18.5151994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5152115Z test(self, **param_kwargs) 2022-11-23T03:12:18.5152394Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5152508Z self.run_subtests( 2022-11-23T03:12:18.5152867Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5152972Z return func(*args, **kwargs) 2022-11-23T03:12:18.5153324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5153490Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5153771Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5153896Z self.run_subtests( 2022-11-23T03:12:18.5154262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5154417Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5154771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5154934Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5155289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5155414Z output = model(*input) 2022-11-23T03:12:18.5155776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5155933Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5156257Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5156409Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5156789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5156909Z output = model(*input) 2022-11-23T03:12:18.5157265Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5157441Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5157769Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5157911Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5158278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5158402Z _lazy_init(state, module) 2022-11-23T03:12:18.5158829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5159016Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5159350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5159496Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5159862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5159985Z _lazy_init(state, module) 2022-11-23T03:12:18.5160323Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5160450Z return func(*args, **kwargs) 2022-11-23T03:12:18.5160795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5160988Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5161351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5161459Z p_assert( 2022-11-23T03:12:18.5161801Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.5161930Z return func(*args, **kwargs) 2022-11-23T03:12:18.5162266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5162391Z traceback.print_stack() 2022-11-23T03:12:18.5162772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5162875Z p_assert( 2022-11-23T03:12:18.5163184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5163311Z traceback.print_stack() 2022-11-23T03:12:18.5163443Z File "", line 1, in 2022-11-23T03:12:18.5163660Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5163807Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5164015Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5164170Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5164362Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5164468Z self.run() 2022-11-23T03:12:18.5164670Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5164819Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5165161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5165297Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5165660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5165791Z getattr(self, test_name)() 2022-11-23T03:12:18.5166132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5166234Z fn() 2022-11-23T03:12:18.5166603Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5166731Z test(self, **param_kwargs) 2022-11-23T03:12:18.5167090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5167217Z return func(*args, **kwargs) 2022-11-23T03:12:18.5167596Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5167614Z self.run_subtests( 2022-11-23T03:12:18.5167948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5168164Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5168543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5168698Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5169168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5169202Z output = model(*input) 2022-11-23T03:12:18.5169529Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5169673Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5170090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5170270Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5170744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5170826Z _lazy_init(state, module) 2022-11-23T03:12:18.5171175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.5171322Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5171654Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5171779Z return func(*args, **kwargs) 2022-11-23T03:12:18.5172138Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5172245Z p_assert( 2022-11-23T03:12:18.5172587Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5172721Z traceback.print_stack() 2022-11-23T03:12:18.5172970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.5173218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.5173460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.5173702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.5174081Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 
2022-11-23T03:12:18.5174212Z File "", line 1, in 2022-11-23T03:12:18.5174427Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5174575Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5174784Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5174941Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5175156Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5175265Z self.run() 2022-11-23T03:12:18.5175450Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5175598Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5175942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5176075Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5176439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5176565Z getattr(self, test_name)() 2022-11-23T03:12:18.5176926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5177027Z fn() 2022-11-23T03:12:18.5177417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5177556Z test(self, **param_kwargs) 2022-11-23T03:12:18.5177919Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5178047Z return func(*args, **kwargs) 2022-11-23T03:12:18.5178412Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5178445Z self.run_subtests( 2022-11-23T03:12:18.5178801Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5178964Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5179405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5179587Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5179897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5180017Z output = model(*input) 2022-11-23T03:12:18.5180344Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5180489Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5180864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5181041Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5181386Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5181506Z _lazy_init(state, module) 2022-11-23T03:12:18.5181859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5182013Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5182351Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5182476Z return func(*args, **kwargs) 2022-11-23T03:12:18.5182857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5183061Z p_assert( 2022-11-23T03:12:18.5183280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5183408Z traceback.print_stack() 2022-11-23T03:12:18.5183901Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5184574Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5184975Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5185119Z File "", line 1, in 2022-11-23T03:12:18.5185332Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5185495Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5185668Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5185835Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5185952Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5186061Z self.run() 2022-11-23T03:12:18.5186265Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5186414Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5186756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5186965Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5187322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5187449Z getattr(self, test_name)() 2022-11-23T03:12:18.5187812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5187911Z fn() 2022-11-23T03:12:18.5188278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5188405Z test(self, **param_kwargs) 2022-11-23T03:12:18.5188762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5188890Z return func(*args, **kwargs) 2022-11-23T03:12:18.5189149Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5189336Z self.run_subtests( 2022-11-23T03:12:18.5189696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5189940Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5190226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5190380Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5190756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5190877Z output = model(*input) 2022-11-23T03:12:18.5191183Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5191324Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5191708Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5191885Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5192256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5192381Z _lazy_init(state, module) 2022-11-23T03:12:18.5192733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5192878Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5193198Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5193328Z return func(*args, **kwargs) 2022-11-23T03:12:18.5193709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5193816Z p_assert( 2022-11-23T03:12:18.5194158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5194291Z traceback.print_stack() 2022-11-23T03:12:18.5194425Z File "", line 1, in 2022-11-23T03:12:18.5194638Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5194762Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5194968Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5195122Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5195336Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5195442Z self.run() 2022-11-23T03:12:18.5195647Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5195796Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5196118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5196299Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5196672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5196797Z getattr(self, test_name)() 2022-11-23T03:12:18.5197153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5197253Z fn() 2022-11-23T03:12:18.5197617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5197742Z test(self, **param_kwargs) 2022-11-23T03:12:18.5198078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5198205Z return func(*args, **kwargs) 2022-11-23T03:12:18.5198491Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5198655Z self.run_subtests( 2022-11-23T03:12:18.5199015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5199182Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5199547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5199701Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5200057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5200180Z output = model(*input) 2022-11-23T03:12:18.5200508Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5200653Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5201036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5201213Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5201580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5201705Z _lazy_init(state, module) 2022-11-23T03:12:18.5202034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5202180Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5202517Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5202706Z return func(*args, **kwargs) 2022-11-23T03:12:18.5203021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5203128Z p_assert( 2022-11-23T03:12:18.5203471Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5203599Z traceback.print_stack() 2022-11-23T03:12:18.5203710Z File "", line 1, in 2022-11-23T03:12:18.5203920Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5204065Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5204270Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5204421Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5204635Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5204742Z self.run() 2022-11-23T03:12:18.5204926Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5205075Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5205472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5205618Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5205983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5206109Z getattr(self, test_name)() 2022-11-23T03:12:18.5206467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5206567Z fn() 2022-11-23T03:12:18.5206910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5207037Z test(self, **param_kwargs) 2022-11-23T03:12:18.5207464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5207525Z return func(*args, **kwargs) 2022-11-23T03:12:18.5207872Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5207987Z self.run_subtests( 2022-11-23T03:12:18.5208342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5208589Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5208849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5209005Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5209382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5209503Z output = model(*input) 2022-11-23T03:12:18.5209832Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5209977Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5210355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5210534Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5210899Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5211005Z _lazy_init(state, module) 2022-11-23T03:12:18.5211361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5211506Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5211843Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5211959Z return func(*args, **kwargs) 2022-11-23T03:12:18.5222383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5222531Z p_assert( 2022-11-23T03:12:18.5222932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5223047Z traceback.print_stack() 2022-11-23T03:12:18.5223299Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.5223537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.5223767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.5224329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.5224784Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5225180Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.5225376Z File "", line 1, in 2022-11-23T03:12:18.5225670Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5225811Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5226007Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5226184Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5226469Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5226575Z self.run() 2022-11-23T03:12:18.5226758Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5226898Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5227210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5227384Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5227742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5227849Z getattr(self, test_name)() 2022-11-23T03:12:18.5228200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5228287Z fn() 2022-11-23T03:12:18.5228641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5228751Z test(self, **param_kwargs) 2022-11-23T03:12:18.5229096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5229209Z return func(*args, **kwargs) 2022-11-23T03:12:18.5229485Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5229585Z self.run_subtests( 2022-11-23T03:12:18.5229945Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5230118Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5230487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5230638Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5231013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5231135Z output = model(*input) 2022-11-23T03:12:18.5231458Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5231581Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5231960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5232146Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5232517Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5232645Z _lazy_init(state, module) 2022-11-23T03:12:18.5232993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5233140Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5233538Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5233597Z return func(*args, **kwargs) 2022-11-23T03:12:18.5233958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5234063Z p_assert( 2022-11-23T03:12:18.5234399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5234586Z traceback.print_stack() 2022-11-23T03:12:18.5234730Z File "", line 1, in 2022-11-23T03:12:18.5234948Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5235189Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5235278Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5235429Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5235643Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5235751Z self.run() 2022-11-23T03:12:18.5236027Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5236177Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5236529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5236713Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5237061Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5237188Z getattr(self, test_name)() 2022-11-23T03:12:18.5237550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5237653Z fn() 2022-11-23T03:12:18.5238019Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5238142Z test(self, **param_kwargs) 2022-11-23T03:12:18.5238566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5238621Z return func(*args, **kwargs) 2022-11-23T03:12:18.5238882Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5239001Z self.run_subtests( 
2022-11-23T03:12:18.5239358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5239522Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5239890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5240044Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5240419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5240541Z output = model(*input) 2022-11-23T03:12:18.5240847Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5240989Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5241370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5241555Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5241925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5242107Z _lazy_init(state, module) 2022-11-23T03:12:18.5242468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5242614Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5242933Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5243059Z return func(*args, **kwargs) 2022-11-23T03:12:18.5243441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5243543Z p_assert( 2022-11-23T03:12:18.5243936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.5244074Z traceback.print_stack() 2022-11-23T03:12:18.5244479Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5244879Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5244992Z File "", line 1, in 2022-11-23T03:12:18.5245206Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5245353Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5245558Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5245709Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5245921Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5246165Z self.run() 2022-11-23T03:12:18.5246293Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5246420Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5246766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5246900Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5247260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5247384Z getattr(self, test_name)() 2022-11-23T03:12:18.5247743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5247843Z fn() 2022-11-23T03:12:18.5248187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5248312Z test(self, **param_kwargs) 2022-11-23T03:12:18.5248673Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5248799Z return func(*args, **kwargs) 2022-11-23T03:12:18.5249077Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5249196Z self.run_subtests( 2022-11-23T03:12:18.5249651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5249714Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5250055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5250306Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5250580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5250706Z output = model(*input) 2022-11-23T03:12:18.5250837Z File "", line 1, in 2022-11-23T03:12:18.5251160Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5251301Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5251681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5251839Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5252048Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5252189Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5252554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5252678Z _lazy_init(state, module) 2022-11-23T03:12:18.5252883Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, 
in _main 2022-11-23T03:12:18.5253087Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5253451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5253575Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5253789Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5253896Z self.run() 2022-11-23T03:12:18.5254236Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5254363Z return func(*args, **kwargs) 2022-11-23T03:12:18.5254563Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5254709Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5255087Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5255221Z p_assert( 2022-11-23T03:12:18.5255562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5255694Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5256030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5256158Z traceback.print_stack() 2022-11-23T03:12:18.5256513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5256632Z getattr(self, test_name)() 2022-11-23T03:12:18.5256987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5257067Z fn() 2022-11-23T03:12:18.5257425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5257555Z test(self, **param_kwargs) 2022-11-23T03:12:18.5257919Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5258045Z return func(*args, **kwargs) 2022-11-23T03:12:18.5258325Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5258439Z self.run_subtests( 2022-11-23T03:12:18.5258792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5258936Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5259302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5259456Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5259835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5259963Z output = model(*input) 2022-11-23T03:12:18.5260292Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5260433Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5260807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5260963Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5261324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5261445Z _lazy_init(state, module) 2022-11-23T03:12:18.5261795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5262024Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5262280Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.5262457Z return func(*args, **kwargs) 2022-11-23T03:12:18.5262847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5262932Z p_assert( 2022-11-23T03:12:18.5263272Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5263402Z traceback.print_stack() 2022-11-23T03:12:18.5263649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.5264176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.5264434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.5264774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.5265243Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 
2022-11-23T03:12:18.5265327Z File "", line 1, in 2022-11-23T03:12:18.5265589Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5265666Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5265931Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5266066Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5266293Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5266399Z self.run() 2022-11-23T03:12:18.5266610Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5266728Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5266989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5267121Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5267480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5267626Z getattr(self, test_name)() 2022-11-23T03:12:18.5267956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5268052Z fn() 2022-11-23T03:12:18.5268416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5268522Z test(self, **param_kwargs) 2022-11-23T03:12:18.5268893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5269001Z return func(*args, **kwargs) 2022-11-23T03:12:18.5269274Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5269392Z self.run_subtests( 2022-11-23T03:12:18.5269804Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5269975Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5270351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5270486Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5270856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5270970Z output = model(*input) 2022-11-23T03:12:18.5271290Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5271427Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5271879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5272067Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5272436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5272538Z _lazy_init(state, module) 2022-11-23T03:12:18.5272886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5273027Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5273359Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5273481Z return func(*args, **kwargs) 2022-11-23T03:12:18.5273851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5274007Z p_assert( 2022-11-23T03:12:18.5274350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5274458Z traceback.print_stack() 2022-11-23T03:12:18.5274854Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5275249Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5275637Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5276059Z File "", line 1, in 2022-11-23T03:12:18.5276196Z File "", line 1, in 2022-11-23T03:12:18.5276407Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5276550Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5276740Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5276893Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5277099Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5277240Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5277536Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5277549Z self.run() 2022-11-23T03:12:18.5277742Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5277873Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5278076Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5278216Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5278426Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5278537Z self.run() 2022-11-23T03:12:18.5278891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5279028Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:18.5279226Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5279398Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5279721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5279847Z getattr(self, test_name)() 2022-11-23T03:12:18.5280175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5280305Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5280664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5280754Z fn() 2022-11-23T03:12:18.5281174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5281286Z getattr(self, test_name)() 2022-11-23T03:12:18.5281652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5281777Z test(self, **param_kwargs) 2022-11-23T03:12:18.5282123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5282220Z fn() 2022-11-23T03:12:18.5282572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5282689Z return func(*args, **kwargs) 2022-11-23T03:12:18.5283147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5283252Z test(self, **param_kwargs) 2022-11-23T03:12:18.5283632Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5283720Z self.run_subtests( 2022-11-23T03:12:18.5284081Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5284297Z return func(*args, **kwargs) 2022-11-23T03:12:18.5284553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5284711Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5284982Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5285077Z self.run_subtests( 2022-11-23T03:12:18.5285428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5285582Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5285928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5286080Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5286450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5286567Z output = model(*input) 2022-11-23T03:12:18.5286925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5287058Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5287373Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5287513Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5287885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5288007Z output = model(*input) 2022-11-23T03:12:18.5288381Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5288556Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5288882Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5289003Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5289361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5289481Z _lazy_init(state, module) 2022-11-23T03:12:18.5289854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5290025Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5290426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5290572Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5290937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5291039Z _lazy_init(state, module) 2022-11-23T03:12:18.5291370Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5291491Z return func(*args, **kwargs) 2022-11-23T03:12:18.5291835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5291971Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5292345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5292440Z p_assert( 2022-11-23T03:12:18.5292828Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.5292934Z return func(*args, **kwargs) 2022-11-23T03:12:18.5293262Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5293382Z traceback.print_stack() 2022-11-23T03:12:18.5293758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5293861Z p_assert( 2022-11-23T03:12:18.5294194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5294321Z traceback.print_stack() 2022-11-23T03:12:18.5294445Z File "", line 1, in 2022-11-23T03:12:18.5294636Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5294780Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5294983Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5295129Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5295332Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5295433Z self.run() 2022-11-23T03:12:18.5295630Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5295756Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5296090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5296214Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5296574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5296692Z getattr(self, test_name)() 2022-11-23T03:12:18.5297046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5297141Z fn() 2022-11-23T03:12:18.5297505Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5297609Z test(self, **param_kwargs) 2022-11-23T03:12:18.5297961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5298078Z return func(*args, **kwargs) 2022-11-23T03:12:18.5298348Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5298462Z self.run_subtests( 2022-11-23T03:12:18.5298806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5298965Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5299324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5299510Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5299894Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5300016Z output = model(*input) 2022-11-23T03:12:18.5300334Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5300464Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5300828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5300998Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5301355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5301456Z _lazy_init(state, module) 2022-11-23T03:12:18.5301858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.5302001Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5302334Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5302453Z return func(*args, **kwargs) 2022-11-23T03:12:18.5302831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5303000Z p_assert( 2022-11-23T03:12:18.5303276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5303384Z traceback.print_stack() 2022-11-23T03:12:18.5303630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.5304139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.5304401Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.5304776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.5305282Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.5305406Z File "", line 1, in 2022-11-23T03:12:18.5305609Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5305735Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5305951Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5306106Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5306268Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5306392Z self.run() 2022-11-23T03:12:18.5306521Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5306758Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5307003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5307118Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5307479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5307600Z getattr(self, test_name)() 2022-11-23T03:12:18.5307960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5308060Z fn() 2022-11-23T03:12:18.5308425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5308548Z test(self, **param_kwargs) 2022-11-23T03:12:18.5308982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5309101Z return func(*args, **kwargs) 2022-11-23T03:12:18.5309378Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5309493Z self.run_subtests( 2022-11-23T03:12:18.5309850Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5310011Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5310373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5310533Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5310907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5311073Z output = model(*input) 2022-11-23T03:12:18.5311399Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5311542Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5311911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5312081Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5312441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5312561Z _lazy_init(state, module) 2022-11-23T03:12:18.5312910Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5313035Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5313370Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5313495Z return func(*args, **kwargs) 2022-11-23T03:12:18.5313875Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5313975Z p_assert( 2022-11-23T03:12:18.5314309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5314451Z traceback.print_stack() 2022-11-23T03:12:18.5314892Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5315203Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5315583Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5315722Z File "", line 1, in 2022-11-23T03:12:18.5315926Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5316070Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5316269Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5316417Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5316628Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5316714Z self.run() 2022-11-23T03:12:18.5316912Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5317056Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5317390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5317517Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5317873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5317998Z getattr(self, test_name)() 2022-11-23T03:12:18.5318396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5318482Z fn() 2022-11-23T03:12:18.5318848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5318972Z test(self, **param_kwargs) 2022-11-23T03:12:18.5319326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5319449Z return func(*args, **kwargs) 2022-11-23T03:12:18.5319728Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5319839Z self.run_subtests( 2022-11-23T03:12:18.5320189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5320385Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5320749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5320898Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5321273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5321393Z output = model(*input) 2022-11-23T03:12:18.5321713Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5321856Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5322225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5322382Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5322747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5322862Z _lazy_init(state, module) 2022-11-23T03:12:18.5323266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5323354Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5323484Z 
File "", line 1, in 2022-11-23T03:12:18.5323812Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5323930Z return func(*args, **kwargs) 2022-11-23T03:12:18.5324121Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5324258Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5324633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5324737Z p_assert( 2022-11-23T03:12:18.5324933Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5325077Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5325409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5325535Z traceback.print_stack() 2022-11-23T03:12:18.5325730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5325834Z self.run() 2022-11-23T03:12:18.5326031Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5326171Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5326508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5326638Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5327093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5327152Z getattr(self, test_name)() 2022-11-23T03:12:18.5327556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5327612Z fn() 2022-11-23T03:12:18.5327970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5328089Z test(self, **param_kwargs) 2022-11-23T03:12:18.5328440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5328562Z return func(*args, **kwargs) 2022-11-23T03:12:18.5328837Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5328933Z self.run_subtests( 2022-11-23T03:12:18.5329281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5329493Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5329854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5330000Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5330374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5330489Z output = model(*input) 2022-11-23T03:12:18.5330815Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5330938Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5331063Z File "", line 1, in 2022-11-23T03:12:18.5331433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5331608Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5331971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5332086Z _lazy_init(state, module) 2022-11-23T03:12:18.5332294Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5332434Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:18.5332763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5332901Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5333101Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5333247Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5333575Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5333701Z return func(*args, **kwargs) 2022-11-23T03:12:18.5333917Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5334018Z self.run() 2022-11-23T03:12:18.5334378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5334481Z p_assert( 2022-11-23T03:12:18.5334686Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5334832Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5335161Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5335287Z traceback.print_stack() 2022-11-23T03:12:18.5335623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5335753Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5336152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5336386Z getattr(self, test_name)() 2022-11-23T03:12:18.5336642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5336733Z fn() 2022-11-23T03:12:18.5337096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in 
instantiated_test 2022-11-23T03:12:18.5337219Z test(self, **param_kwargs) 2022-11-23T03:12:18.5337573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5337680Z return func(*args, **kwargs) 2022-11-23T03:12:18.5337954Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5338064Z self.run_subtests( 2022-11-23T03:12:18.5338468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5338630Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5338998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5339154Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5339525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5339644Z output = model(*input) 2022-11-23T03:12:18.5339952Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5340089Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5340460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5340641Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5341007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5341123Z _lazy_init(state, module) 2022-11-23T03:12:18.5341473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5341624Z handle.init_flat_param_attributes() 
2022-11-23T03:12:18.5341942Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5342123Z return func(*args, **kwargs) 2022-11-23T03:12:18.5342513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5342613Z p_assert( 2022-11-23T03:12:18.5342948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5343075Z traceback.print_stack() 2022-11-23T03:12:18.5343319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.5343564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.5343785Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.5344399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.5344916Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 
2022-11-23T03:12:18.5345050Z File "", line 1, in 2022-11-23T03:12:18.5345240Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5345401Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5345602Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5345821Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5346042Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5346066Z self.run() 2022-11-23T03:12:18.5346255Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5346488Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5346737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5346876Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5347234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5347338Z getattr(self, test_name)() 2022-11-23T03:12:18.5347696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5347880Z fn() 2022-11-23T03:12:18.5348245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5348361Z test(self, **param_kwargs) 2022-11-23T03:12:18.5348711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5348830Z return func(*args, **kwargs) 2022-11-23T03:12:18.5349105Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5349200Z self.run_subtests( 2022-11-23T03:12:18.5349555Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5349723Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5350085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5350243Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5350617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5350767Z output = model(*input) 2022-11-23T03:12:18.5351057Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5351180Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5351551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5351727Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5352162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5352210Z _lazy_init(state, module) 2022-11-23T03:12:18.5352553Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5352702Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5353038Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5353145Z return func(*args, **kwargs) 2022-11-23T03:12:18.5353525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5353629Z p_assert( 2022-11-23T03:12:18.5353960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5354082Z traceback.print_stack() 2022-11-23T03:12:18.5354475Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5354868Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5355303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5355444Z File "", line 1, in 2022-11-23T03:12:18.5355635Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5355770Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5355968Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5356125Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5356335Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5356439Z self.run() 2022-11-23T03:12:18.5356642Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5356770Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5357111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5357296Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5357421Z File "", line 1, in 2022-11-23T03:12:18.5357778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5357897Z getattr(self, test_name)() 2022-11-23T03:12:18.5358098Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5358239Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5358578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:18.5358677Z fn() 2022-11-23T03:12:18.5358872Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5359022Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5359381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5359510Z test(self, **param_kwargs) 2022-11-23T03:12:18.5359722Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5359828Z self.run() 2022-11-23T03:12:18.5360170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5360289Z return func(*args, **kwargs) 2022-11-23T03:12:18.5360487Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5360625Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5360896Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5361004Z self.run_subtests( 2022-11-23T03:12:18.5361336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5361453Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5361803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5361958Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5362310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5362426Z getattr(self, test_name)() 2022-11-23T03:12:18.5362846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5362929Z fsdp_loss = 
self._train_for_several_steps( 2022-11-23T03:12:18.5363277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5363369Z fn() 2022-11-23T03:12:18.5363724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5363890Z output = model(*input) 2022-11-23T03:12:18.5364264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5364382Z test(self, **param_kwargs) 2022-11-23T03:12:18.5364697Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5364835Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5365188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5365295Z return func(*args, **kwargs) 2022-11-23T03:12:18.5365666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5365835Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5366163Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5366274Z self.run_subtests( 2022-11-23T03:12:18.5366635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5366750Z _lazy_init(state, module) 2022-11-23T03:12:18.5367096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5367254Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5367589Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5367728Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5368084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5368237Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5368572Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5368695Z return func(*args, **kwargs) 2022-11-23T03:12:18.5369063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5369248Z output = model(*input) 2022-11-23T03:12:18.5369547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5369646Z p_assert( 2022-11-23T03:12:18.5370026Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5370257Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5370510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5370637Z traceback.print_stack() 2022-11-23T03:12:18.5370767Z File "", line 1, in 2022-11-23T03:12:18.5371127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5371298Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5371659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5371779Z _lazy_init(state, module) 2022-11-23T03:12:18.5371983Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 
2022-11-23T03:12:18.5372119Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5372464Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5372605Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5372788Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5373021Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5373364Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5373485Z return func(*args, **kwargs) 2022-11-23T03:12:18.5373694Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5373836Z self.run() 2022-11-23T03:12:18.5374166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5374270Z p_assert( 2022-11-23T03:12:18.5374455Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5374597Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5374928Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5375100Z traceback.print_stack() 2022-11-23T03:12:18.5375442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5375572Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5375927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5376046Z getattr(self, test_name)() 2022-11-23T03:12:18.5376383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5376482Z fn() 2022-11-23T03:12:18.5376847Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5376967Z test(self, **param_kwargs) 2022-11-23T03:12:18.5377316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5377439Z return func(*args, **kwargs) 2022-11-23T03:12:18.5377721Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5377835Z self.run_subtests( 2022-11-23T03:12:18.5378167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5378320Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5378674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5378825Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5379200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5379319Z output = model(*input) 2022-11-23T03:12:18.5379642Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5379794Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5380150Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5380423Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5380689Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5380805Z _lazy_init(state, module) 2022-11-23T03:12:18.5381148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.5381287Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5381623Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5381741Z return func(*args, **kwargs) 2022-11-23T03:12:18.5382098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5382253Z p_assert( 2022-11-23T03:12:18.5382595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5382717Z traceback.print_stack() 2022-11-23T03:12:18.5382958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.5383196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.5383433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.5383668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.5384341Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5384830Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.5385068Z File "", line 1, in 2022-11-23T03:12:18.5385277Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5385411Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5385623Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5385769Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5385978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5386015Z self.run() 2022-11-23T03:12:18.5386185Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5386327Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5386671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5386810Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5387172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5387302Z getattr(self, test_name)() 2022-11-23T03:12:18.5387649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5387728Z fn() 2022-11-23T03:12:18.5388091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5388269Z test(self, **param_kwargs) 2022-11-23T03:12:18.5388630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5388755Z return func(*args, **kwargs) 2022-11-23T03:12:18.5389032Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5389148Z self.run_subtests( 2022-11-23T03:12:18.5389501Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5389646Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5390006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5390158Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5390528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5390648Z output = model(*input) 2022-11-23T03:12:18.5390971Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5391109Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5391483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5391709Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5392090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5392208Z _lazy_init(state, module) 2022-11-23T03:12:18.5392557Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5392701Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5393034Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5393155Z return func(*args, **kwargs) 2022-11-23T03:12:18.5393534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5393618Z p_assert( 2022-11-23T03:12:18.5393951Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5394127Z traceback.print_stack() 2022-11-23T03:12:18.5394257Z File "", line 1, in 2022-11-23T03:12:18.5394465Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5394604Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5394841Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5394989Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5395183Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5395285Z self.run() 2022-11-23T03:12:18.5395485Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5395628Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5395963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5396101Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5396463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5396568Z getattr(self, test_name)() 2022-11-23T03:12:18.5396921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5397016Z fn() 2022-11-23T03:12:18.5397374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5397493Z test(self, **param_kwargs) 2022-11-23T03:12:18.5397841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5397960Z return func(*args, **kwargs) 2022-11-23T03:12:18.5398237Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5398338Z self.run_subtests( 
2022-11-23T03:12:18.5398691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5398855Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5399216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5399367Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5399734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5399848Z output = model(*input) 2022-11-23T03:12:18.5400223Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5400288Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5400658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5400880Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5401248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5401366Z _lazy_init(state, module) 2022-11-23T03:12:18.5401709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5401852Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5402273Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5402290Z return func(*args, **kwargs) 2022-11-23T03:12:18.5402658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5402751Z p_assert( 2022-11-23T03:12:18.5403133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.5403349Z traceback.print_stack() 2022-11-23T03:12:18.5403650Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5404042Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5404170Z File "", line 1, in 2022-11-23T03:12:18.5404361Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5404497Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5404699Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5404843Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5405052Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5405158Z self.run() 2022-11-23T03:12:18.5405354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5405493Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5405813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5405948Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5406302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5406424Z getattr(self, test_name)() 2022-11-23T03:12:18.5406776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5406873Z fn() 2022-11-23T03:12:18.5407235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5407361Z test(self, **param_kwargs) 2022-11-23T03:12:18.5407699Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5407824Z return func(*args, **kwargs) 2022-11-23T03:12:18.5408101Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5408285Z self.run_subtests( 2022-11-23T03:12:18.5408565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5408728Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5409089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5409311Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5409596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5409716Z output = model(*input) 2022-11-23T03:12:18.5410101Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5410251Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5410624Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5410797Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5411160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5411275Z _lazy_init(state, module) 2022-11-23T03:12:18.5411602Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5411746Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5412078Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.5412267Z return func(*args, **kwargs) 2022-11-23T03:12:18.5412646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5412750Z p_assert( 2022-11-23T03:12:18.5413075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5413195Z traceback.print_stack() 2022-11-23T03:12:18.5413306Z File "", line 1, in 2022-11-23T03:12:18.5413510Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5413650Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5413851Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5413994Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5414203Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5414312Z self.run() 2022-11-23T03:12:18.5414496Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5414645Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5414988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5415113Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5415469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5415592Z getattr(self, test_name)() 2022-11-23T03:12:18.5415944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5416038Z fn() 2022-11-23T03:12:18.5416381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5416614Z test(self, **param_kwargs) 2022-11-23T03:12:18.5416864Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5416987Z return func(*args, **kwargs) 2022-11-23T03:12:18.5417259Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5417368Z self.run_subtests( 2022-11-23T03:12:18.5417720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5417934Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5418226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5418381Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5418753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5418924Z output = model(*input) 2022-11-23T03:12:18.5419255Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5419397Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5419765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5419940Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5420283Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5420395Z _lazy_init(state, module) 2022-11-23T03:12:18.5420804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5420945Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5421352Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.5421475Z return func(*args, **kwargs) 2022-11-23T03:12:18.5421848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5421942Z p_assert( 2022-11-23T03:12:18.5422257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5422381Z traceback.print_stack() 2022-11-23T03:12:18.5422625Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.5422870Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.5423109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.5423342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.5423794Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.5424553Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 
2022-11-23T03:12:18.5424697Z File "", line 1, in 2022-11-23T03:12:18.5424879Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5425038Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5425242Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5425394Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5425611Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5425635Z self.run() 2022-11-23T03:12:18.5425819Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5425954Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5426291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5426424Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5426785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5426909Z getattr(self, test_name)() 2022-11-23T03:12:18.5427266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5427363Z fn() 2022-11-23T03:12:18.5427723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5427828Z test(self, **param_kwargs) 2022-11-23T03:12:18.5428186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5428311Z return func(*args, **kwargs) 2022-11-23T03:12:18.5428664Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5428785Z self.run_subtests( 2022-11-23T03:12:18.5429138Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5429301Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5429661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5429795Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5430163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5430276Z output = model(*input) 2022-11-23T03:12:18.5430662Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5430805Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5431173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5431342Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5431700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5431803Z _lazy_init(state, module) 2022-11-23T03:12:18.5432145Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5432281Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5432613Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5432737Z return func(*args, **kwargs) 2022-11-23T03:12:18.5433115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5433215Z p_assert( 2022-11-23T03:12:18.5433345Z File "", line 1, in 2022-11-23T03:12:18.5433663Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5433792Z traceback.print_stack() 2022-11-23T03:12:18.5434006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5434150Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5434346Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5434493Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5434696Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5434798Z self.run() 2022-11-23T03:12:18.5434985Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5435128Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5435470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5435606Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5436022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5436090Z getattr(self, test_name)() 2022-11-23T03:12:18.5436437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5436534Z fn() 2022-11-23T03:12:18.5436881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5436999Z test(self, **param_kwargs) 2022-11-23T03:12:18.5437352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5437526Z return func(*args, **kwargs) 2022-11-23T03:12:18.5437807Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 
2022-11-23T03:12:18.5437919Z self.run_subtests( 2022-11-23T03:12:18.5438270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5438427Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5438770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5438924Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5439299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5439414Z output = model(*input) 2022-11-23T03:12:18.5439834Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5439933Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5440305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5440483Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5440830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5440949Z _lazy_init(state, module) 2022-11-23T03:12:18.5441305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5441450Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5441782Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5441905Z return func(*args, **kwargs) 2022-11-23T03:12:18.5442348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5442449Z p_assert( 2022-11-23T03:12:18.5442767Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5442889Z traceback.print_stack() 2022-11-23T03:12:18.5443281Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.5443668Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.5443796Z File "", line 1, in 2022-11-23T03:12:18.5444000Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5444142Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5444347Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5444484Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5444691Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5444796Z self.run() 2022-11-23T03:12:18.5445089Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5445193Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5445476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5445603Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5445942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5446065Z getattr(self, test_name)() 2022-11-23T03:12:18.5446415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5446514Z fn() 2022-11-23T03:12:18.5446962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5447053Z test(self, **param_kwargs) 
2022-11-23T03:12:18.5447401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5447517Z return func(*args, **kwargs) 2022-11-23T03:12:18.5447773Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5447889Z self.run_subtests( 2022-11-23T03:12:18.5448240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5448409Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5448772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5448976Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5449351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5449471Z output = model(*input) 2022-11-23T03:12:18.5449779Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5449918Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5450288Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5450464Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5450921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5451073Z _lazy_init(state, module) 2022-11-23T03:12:18.5451427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5451569Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5451886Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5452008Z return func(*args, **kwargs) 2022-11-23T03:12:18.5452377Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5452481Z p_assert( 2022-11-23T03:12:18.5452813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5452934Z traceback.print_stack() 2022-11-23T03:12:18.5453063Z File "", line 1, in 2022-11-23T03:12:18.5453268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5453390Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5453596Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5453745Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5453955Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5454053Z self.run() 2022-11-23T03:12:18.5454249Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5454395Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5454728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5454843Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5455202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5455409Z getattr(self, test_name)() 2022-11-23T03:12:18.5455679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5455827Z fn() 2022-11-23T03:12:18.5456197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5456324Z test(self, **param_kwargs) 2022-11-23T03:12:18.5456656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5456781Z return func(*args, **kwargs) 2022-11-23T03:12:18.5457054Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5457165Z self.run_subtests( 2022-11-23T03:12:18.5457513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5457673Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5458029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5458235Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5458607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5458709Z output = model(*input) 2022-11-23T03:12:18.5459035Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5459177Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5459556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5459729Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5460091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5460209Z _lazy_init(state, module) 2022-11-23T03:12:18.5460558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5460684Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5461015Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5461133Z return func(*args, **kwargs) 2022-11-23T03:12:18.5461498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5461593Z p_assert( 2022-11-23T03:12:18.5461918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5462038Z traceback.print_stack() 2022-11-23T03:12:18.5462284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.5462503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.5462740Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.5462969Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.5463360Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.5463487Z File "", line 1, in 2022-11-23T03:12:18.5463694Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5463837Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5464391Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5464516Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5464658Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5464802Z self.run() 2022-11-23T03:12:18.5465095Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5465266Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5465606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5465744Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5466050Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5466210Z getattr(self, test_name)() 2022-11-23T03:12:18.5466563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5466658Z fn() 2022-11-23T03:12:18.5467022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5467137Z test(self, **param_kwargs) 2022-11-23T03:12:18.5467462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5467581Z return func(*args, **kwargs) 2022-11-23T03:12:18.5467840Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5467949Z self.run_subtests( 2022-11-23T03:12:18.5468293Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5468455Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5468806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5468954Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5469317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5469441Z output = model(*input) 2022-11-23T03:12:18.5469749Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5469893Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5470318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5470494Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5470853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5470975Z _lazy_init(state, module) 2022-11-23T03:12:18.5471325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5471471Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5471800Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5471999Z return func(*args, **kwargs) 2022-11-23T03:12:18.5472291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5472393Z p_assert( 2022-11-23T03:12:18.5472719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5472844Z traceback.print_stack() 2022-11-23T03:12:18.5473236Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.5473635Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.5474022Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.5474133Z File "", line 1, in 2022-11-23T03:12:18.5474398Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5474550Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5474754Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5474905Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5475117Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5475219Z self.run() 2022-11-23T03:12:18.5475403Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5475547Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5475881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5476017Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5476374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5476559Z getattr(self, test_name)() 2022-11-23T03:12:18.5476916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5477013Z fn() 2022-11-23T03:12:18.5477357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5477477Z test(self, **param_kwargs) 2022-11-23T03:12:18.5477827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5477953Z return func(*args, **kwargs) 2022-11-23T03:12:18.5478081Z File "", line 1, in 2022-11-23T03:12:18.5478352Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5478467Z self.run_subtests( 2022-11-23T03:12:18.5478817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5478961Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5479205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5479348Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5479709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5479859Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5480158Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5480205Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5480577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5480775Z output = model(*input) 2022-11-23T03:12:18.5480903Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5481005Z self.run() 2022-11-23T03:12:18.5481327Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5481462Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5481666Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T03:12:18.5481804Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5482176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5482335Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5482670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5482808Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5483263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5483385Z _lazy_init(state, module) 2022-11-23T03:12:18.5483748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5483869Z getattr(self, test_name)() 2022-11-23T03:12:18.5484215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5484388Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5484699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5484801Z fn() 2022-11-23T03:12:18.5485134Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5485273Z return func(*args, **kwargs) 2022-11-23T03:12:18.5485616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5485786Z test(self, **param_kwargs) 2022-11-23T03:12:18.5486167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5486336Z p_assert( 2022-11-23T03:12:18.5486607Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5486725Z return func(*args, **kwargs) 2022-11-23T03:12:18.5487049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5487184Z traceback.print_stack() 2022-11-23T03:12:18.5487464Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5487559Z self.run_subtests( 2022-11-23T03:12:18.5487914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5488084Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5488450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5488604Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5488978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5489099Z output = model(*input) 2022-11-23T03:12:18.5489229Z File "", line 1, in 2022-11-23T03:12:18.5489534Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5489669Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5490042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5490223Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5490435Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5490574Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5490938Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T03:12:18.5491115Z _lazy_init(state, module) 2022-11-23T03:12:18.5491240Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5491388Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5491736Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5491879Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5492091Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5492193Z self.run() 2022-11-23T03:12:18.5492578Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5492707Z return func(*args, **kwargs) 2022-11-23T03:12:18.5492892Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5493034Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5493407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5493504Z p_assert( 2022-11-23T03:12:18.5493827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5493963Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5494294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5494415Z traceback.print_stack() 2022-11-23T03:12:18.5494813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5494931Z getattr(self, test_name)() 2022-11-23T03:12:18.5495287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5495385Z fn() 2022-11-23T03:12:18.5495741Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5495858Z test(self, **param_kwargs) 2022-11-23T03:12:18.5496205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5496323Z return func(*args, **kwargs) 2022-11-23T03:12:18.5496581Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5496689Z self.run_subtests( 2022-11-23T03:12:18.5497042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5497197Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5497553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5497702Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5498072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5498188Z output = model(*input) 2022-11-23T03:12:18.5498495Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5498633Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5499058Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5499277Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5499544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5499659Z _lazy_init(state, module) 2022-11-23T03:12:18.5500002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.5500137Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5500453Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5500573Z return func(*args, **kwargs) 2022-11-23T03:12:18.5500946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5501044Z p_assert( 2022-11-23T03:12:18.5501375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5501504Z traceback.print_stack() 2022-11-23T03:12:18.5501793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.5502035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.5502247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.5502473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.5502865Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 
2022-11-23T03:12:18.5502988Z File "", line 1, in 2022-11-23T03:12:18.5503193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5503329Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5503575Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5503811Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5504162Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5504276Z self.run() 2022-11-23T03:12:18.5504580Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5504670Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5505049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5505179Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5505532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5505685Z getattr(self, test_name)() 2022-11-23T03:12:18.5506017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5506112Z fn() 2022-11-23T03:12:18.5506462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5506579Z test(self, **param_kwargs) 2022-11-23T03:12:18.5506959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5507086Z return func(*args, **kwargs) 2022-11-23T03:12:18.5507343Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5507371Z self.run_subtests( 2022-11-23T03:12:18.5507706Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5507861Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5508219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5508376Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5508747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5508860Z output = model(*input) 2022-11-23T03:12:18.5509185Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5509322Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5509771Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5509853Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5510291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5510377Z _lazy_init(state, module) 2022-11-23T03:12:18.5510816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5510993Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5511259Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5511350Z return func(*args, **kwargs) 2022-11-23T03:12:18.5511777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5511882Z p_assert( 2022-11-23T03:12:18.5512177Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5512276Z traceback.print_stack() 2022-11-23T03:12:18.5512641Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.5513033Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.5513524Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.5513652Z File "", line 1, in 2022-11-23T03:12:18.5513843Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5513984Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5514181Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5514329Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5514452Z File "", line 1, in 2022-11-23T03:12:18.5514662Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5514758Z self.run() 2022-11-23T03:12:18.5514958Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5515090Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5515311Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5515433Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5515769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5515898Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5516094Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5516238Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5516579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T03:12:18.5516706Z getattr(self, test_name)() 2022-11-23T03:12:18.5516908Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5517066Z self.run() 2022-11-23T03:12:18.5517349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5517448Z fn() 2022-11-23T03:12:18.5517649Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5517791Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5518237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5518302Z test(self, **param_kwargs) 2022-11-23T03:12:18.5518599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5518726Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5519139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5519266Z return func(*args, **kwargs) 2022-11-23T03:12:18.5519630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5519811Z getattr(self, test_name)() 2022-11-23T03:12:18.5520084Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5520214Z self.run_subtests( 2022-11-23T03:12:18.5520565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5520647Z fn() 2022-11-23T03:12:18.5520927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5521057Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 
2022-11-23T03:12:18.5521454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5521620Z test(self, **param_kwargs) 2022-11-23T03:12:18.5521962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5522177Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5522548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5522675Z return func(*args, **kwargs) 2022-11-23T03:12:18.5523033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5523137Z output = model(*input) 2022-11-23T03:12:18.5523420Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5523440Z self.run_subtests( 2022-11-23T03:12:18.5523824Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5523885Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5524340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5524398Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5524774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5524946Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5525301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5525447Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5525793Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5525907Z _lazy_init(state, module) 2022-11-23T03:12:18.5526275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5526391Z output = model(*input) 2022-11-23T03:12:18.5526756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5526895Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5527210Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5527396Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5527715Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5527898Z return func(*args, **kwargs) 2022-11-23T03:12:18.5528209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5528378Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5528749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5528894Z p_assert( 2022-11-23T03:12:18.5529268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5529392Z _lazy_init(state, module) 2022-11-23T03:12:18.5529708Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5529828Z traceback.print_stack() 2022-11-23T03:12:18.5530169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5530309Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.5530440Z File "", line 1, in 2022-11-23T03:12:18.5530779Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5531005Z return func(*args, **kwargs) 2022-11-23T03:12:18.5531443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5531528Z p_assert( 2022-11-23T03:12:18.5531737Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5531874Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5532213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5532335Z traceback.print_stack() 2022-11-23T03:12:18.5532528Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5532674Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5532880Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5532967Z self.run() 2022-11-23T03:12:18.5533168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5533316Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5533655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5533879Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5534137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5534259Z getattr(self, test_name)() 2022-11-23T03:12:18.5534594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5534688Z fn() 2022-11-23T03:12:18.5535051Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5535168Z test(self, **param_kwargs) 2022-11-23T03:12:18.5535516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5535642Z return func(*args, **kwargs) 2022-11-23T03:12:18.5535988Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5536028Z self.run_subtests( 2022-11-23T03:12:18.5536361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5536518Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5536875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5537022Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5537392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5537505Z output = model(*input) 2022-11-23T03:12:18.5537822Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5537962Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5538363Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5538544Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5538909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5539029Z _lazy_init(state, module) 2022-11-23T03:12:18.5539369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.5539507Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5539838Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5539958Z return func(*args, **kwargs) 2022-11-23T03:12:18.5540328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5540483Z p_assert( 2022-11-23T03:12:18.5540820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5540946Z traceback.print_stack() 2022-11-23T03:12:18.5541187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.5541484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.5541721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.5541943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.5542399Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.5542519Z File "", line 1, in 2022-11-23T03:12:18.5542733Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5542869Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5543071Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5543217Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5543517Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5543541Z self.run() 2022-11-23T03:12:18.5543758Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5544192Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5544546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5544664Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5545026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5545163Z getattr(self, test_name)() 2022-11-23T03:12:18.5545525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5545617Z fn() 2022-11-23T03:12:18.5545973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5546095Z test(self, **param_kwargs) 2022-11-23T03:12:18.5546449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5546574Z return func(*args, **kwargs) 2022-11-23T03:12:18.5546846Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5546952Z self.run_subtests( 2022-11-23T03:12:18.5547268Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5547452Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5547805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5547955Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5548328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5548447Z output = model(*input) 2022-11-23T03:12:18.5548770Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5548911Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5549282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5549448Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5549867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5549983Z _lazy_init(state, module) 2022-11-23T03:12:18.5550330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5550465Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5550793Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5550916Z return func(*args, **kwargs) 2022-11-23T03:12:18.5551291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5551386Z p_assert( 2022-11-23T03:12:18.5551790Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5551828Z traceback.print_stack() 2022-11-23T03:12:18.5552224Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.5552615Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.5553008Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.5553131Z File "", line 1, in 2022-11-23T03:12:18.5553338Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5553479Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5553681Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5553814Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5554024Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5554130Z self.run() 2022-11-23T03:12:18.5554404Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5554482Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5554819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5554950Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5555360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5555406Z getattr(self, test_name)() 2022-11-23T03:12:18.5555781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5555856Z fn() 2022-11-23T03:12:18.5556214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5556337Z test(self, **param_kwargs) 2022-11-23T03:12:18.5556735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5556866Z return func(*args, **kwargs) 2022-11-23T03:12:18.5557124Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5557237Z self.run_subtests( 2022-11-23T03:12:18.5557591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5557749Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5558104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5558247Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5558612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5558774Z output = model(*input) 2022-11-23T03:12:18.5559086Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5559226Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5559595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5559765Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5560125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5560241Z _lazy_init(state, module) 2022-11-23T03:12:18.5560581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5560722Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5561036Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5561165Z return func(*args, **kwargs) 2022-11-23T03:12:18.5561543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5561644Z p_assert( 2022-11-23T03:12:18.5561972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5562097Z traceback.print_stack() 2022-11-23T03:12:18.5562222Z File "", line 1, in 2022-11-23T03:12:18.5562429Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5562553Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5562750Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5562897Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5563106Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5563213Z self.run() 2022-11-23T03:12:18.5563417Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5563559Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5563880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5564006Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5564362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5564481Z getattr(self, test_name)() 2022-11-23T03:12:18.5564834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5564931Z fn() 2022-11-23T03:12:18.5565293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5565415Z test(self, **param_kwargs) 2022-11-23T03:12:18.5565801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5565925Z return func(*args, **kwargs) 2022-11-23T03:12:18.5566194Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5566303Z self.run_subtests( 2022-11-23T03:12:18.5566658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5566814Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5567175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5567326Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5567679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5567861Z output = model(*input) 2022-11-23T03:12:18.5568188Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5568328Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5568698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5568875Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5569238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5569438Z _lazy_init(state, module) 2022-11-23T03:12:18.5569765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5569916Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5570325Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5570440Z return func(*args, **kwargs) 2022-11-23T03:12:18.5570824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5570914Z p_assert( 2022-11-23T03:12:18.5571252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5571343Z traceback.print_stack() 2022-11-23T03:12:18.5571468Z File "", line 1, in 2022-11-23T03:12:18.5571701Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5571901Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5572039Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5572224Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5572405Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5572508Z self.run() 2022-11-23T03:12:18.5572706Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5572832Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5573165Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5573295Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5573646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5573769Z getattr(self, test_name)() 2022-11-23T03:12:18.5574119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5574212Z fn() 2022-11-23T03:12:18.5574565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5574675Z test(self, **param_kwargs) 2022-11-23T03:12:18.5575080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5575207Z return func(*args, **kwargs) 2022-11-23T03:12:18.5575478Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5575587Z self.run_subtests( 2022-11-23T03:12:18.5575938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5576097Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5576457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5576592Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5576964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5577137Z output = model(*input) 2022-11-23T03:12:18.5577459Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5577598Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5577967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5578144Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5578509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5578612Z _lazy_init(state, module) 2022-11-23T03:12:18.5578955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5579098Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5579452Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5579564Z return func(*args, **kwargs) 2022-11-23T03:12:18.5579934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5580035Z p_assert( 2022-11-23T03:12:18.5580365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5580473Z traceback.print_stack() 2022-11-23T03:12:18.5580712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.5580944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.5581169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.5581397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.5581795Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 
2022-11-23T03:12:18.5581923Z File "", line 1, in 2022-11-23T03:12:18.5582126Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5582249Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5582451Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5582594Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5582804Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5582903Z self.run() 2022-11-23T03:12:18.5583104Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5583243Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5583630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5583751Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5584620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5584747Z getattr(self, test_name)() 2022-11-23T03:12:18.5585102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5585197Z fn() 2022-11-23T03:12:18.5585552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5585679Z test(self, **param_kwargs) 2022-11-23T03:12:18.5585998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5586051Z return func(*args, **kwargs) 2022-11-23T03:12:18.5586418Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5586531Z self.run_subtests( 2022-11-23T03:12:18.5586881Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5587037Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5587393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5587543Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5587915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5588018Z output = model(*input) 2022-11-23T03:12:18.5588340Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5588479Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5588850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5589021Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5589386Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5589497Z _lazy_init(state, module) 2022-11-23T03:12:18.5589841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5589965Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5590299Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5590421Z return func(*args, **kwargs) 2022-11-23T03:12:18.5590799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5590898Z p_assert( 2022-11-23T03:12:18.5591233Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5591358Z traceback.print_stack() 2022-11-23T03:12:18.5591750Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.5592131Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.5592525Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.5592647Z File "", line 1, in 2022-11-23T03:12:18.5592854Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5592994Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5593200Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5593407Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5593631Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5593716Z self.run() 2022-11-23T03:12:18.5593914Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5594053Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5594396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5594523Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5594878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5594998Z getattr(self, test_name)() 2022-11-23T03:12:18.5595353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5595483Z fn() 2022-11-23T03:12:18.5595852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5595965Z test(self, **param_kwargs) 2022-11-23T03:12:18.5596317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5596437Z return func(*args, **kwargs) 2022-11-23T03:12:18.5596560Z File "", line 1, in 2022-11-23T03:12:18.5596827Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5596922Z self.run_subtests( 2022-11-23T03:12:18.5597266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5597419Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5597627Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5597767Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5598122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5598274Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5598474Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5598606Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5598978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5599094Z output = model(*input) 2022-11-23T03:12:18.5599302Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5599402Z self.run() 2022-11-23T03:12:18.5599720Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5599866Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5600068Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T03:12:18.5600195Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5600565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5600739Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5601071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5601202Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5601566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5601686Z _lazy_init(state, module) 2022-11-23T03:12:18.5602036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5602191Z getattr(self, test_name)() 2022-11-23T03:12:18.5602549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5602687Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5603038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5603126Z fn() 2022-11-23T03:12:18.5603457Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5603580Z return func(*args, **kwargs) 2022-11-23T03:12:18.5603946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5604051Z test(self, **param_kwargs) 2022-11-23T03:12:18.5604421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5604583Z p_assert( 2022-11-23T03:12:18.5604935Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5605058Z return func(*args, **kwargs) 2022-11-23T03:12:18.5605391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5605513Z traceback.print_stack() 2022-11-23T03:12:18.5605787Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5605884Z self.run_subtests( 2022-11-23T03:12:18.5606235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5606394Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5606765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5606914Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5607283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5607495Z output = model(*input) 2022-11-23T03:12:18.5607728Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5607851Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5608215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5608388Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5608747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5608867Z _lazy_init(state, module) 2022-11-23T03:12:18.5609215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.5609354Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5609685Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5609791Z return func(*args, **kwargs) 2022-11-23T03:12:18.5610160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5610255Z p_assert( 2022-11-23T03:12:18.5610583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5610706Z traceback.print_stack() 2022-11-23T03:12:18.5610827Z File "", line 1, in 2022-11-23T03:12:18.5611027Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5611166Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5611400Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5611549Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5611760Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5611858Z self.run() 2022-11-23T03:12:18.5612053Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5612197Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5612535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5612650Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5613006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5613129Z getattr(self, test_name)() 2022-11-23T03:12:18.5613545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5613644Z fn() 2022-11-23T03:12:18.5614000Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5614115Z test(self, **param_kwargs) 2022-11-23T03:12:18.5614464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5614569Z return func(*args, **kwargs) 2022-11-23T03:12:18.5614853Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5614965Z self.run_subtests( 2022-11-23T03:12:18.5615413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5615576Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5616009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5616093Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5616462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5616564Z output = model(*input) 2022-11-23T03:12:18.5616882Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5617033Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5617385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5617552Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5617908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5618028Z _lazy_init(state, module) 2022-11-23T03:12:18.5618379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.5618503Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5618832Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5618951Z return func(*args, **kwargs) 2022-11-23T03:12:18.5619321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5619420Z p_assert( 2022-11-23T03:12:18.5619745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5619865Z traceback.print_stack() 2022-11-23T03:12:18.5620112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.5620390Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.5620629Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.5620858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.5621257Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.5621651Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.5622045Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.5622429Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 
2022-11-23T03:12:18.5622713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.5622943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.5623154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.5623543Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.5623784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.5624499Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.5624861Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.5625251Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.5625506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.5625740Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.5625953Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.5626321Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.5626566Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.5626953Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.5627242Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.5627628Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.5627862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.5628087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.5628315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.5628700Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5628929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.5629298Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5629749Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5630150Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.5630385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.5630613Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.5630844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.5631226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.5631454Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.5631833Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5632286Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5632652Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.5633396Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5634129Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5634873Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5635612Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5636439Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5637086Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5637819Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5638546Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5639318Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5640049Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5640775Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5641065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.5641298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.5641523Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.5641916Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5642159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.5642613Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5643006Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.5643385Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.5643621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.5643849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.5644077Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.5644457Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5644691Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.5645083Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5645475Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5645867Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.5646081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.5646315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.5646542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.5646921Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5647155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.5647541Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 
2022-11-23T03:12:18.5647976Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5648373Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.5648601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.5648827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.5649038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.5649418Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5649652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.5650088Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5650474Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5650858Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.5651083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.5651313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.5651627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.5651906Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 
2022-11-23T03:12:18.5652249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.5652631Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5652920Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5653301Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.5653536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.5653762Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.5653986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.5654375Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5654593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.5654980Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5655354Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.5655735Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 
2022-11-23T03:12:18.5655985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.5656215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.5656449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.5656913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5657402Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5657656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.5658023Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5658401Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.5658529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 2 2022-11-23T03:12:18.5658750Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T03:12:18.5659027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T03:12:18.5659410Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.5659787Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 
2022-11-23T03:12:18.5660021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 3 2022-11-23T03:12:18.5660399Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.5660762Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.5660990Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 3 2022-11-23T03:12:18.5661221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T03:12:18.5661446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T03:12:18.5661829Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.5662065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 2 2022-11-23T03:12:18.5662444Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.5662827Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.5663209Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 
2022-11-23T03:12:18.5663445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T03:12:18.5663656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T03:12:18.5664362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 2 2022-11-23T03:12:18.5664769Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.5665185Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.5665425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 3 2022-11-23T03:12:18.5665810Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.5666279Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.5666424Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 3 2022-11-23T03:12:18.5666652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T03:12:18.5666863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T03:12:18.5667251Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.5667486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 2 2022-11-23T03:12:18.5667975Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 
2022-11-23T03:12:18.5668254Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.5668713Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.5668938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T03:12:18.5669160Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 3 2022-11-23T03:12:18.5669384Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T03:12:18.5669746Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.5669976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 2 2022-11-23T03:12:18.5670497Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.5670805Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.5671189Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.5671934Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5672671Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5673412Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5674169Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5674855Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5675635Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5676368Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.5677092Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5677866Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.5678102Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T03:12:18.5678334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 2 2022-11-23T03:12:18.5678558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 3 2022-11-23T03:12:18.5678948Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.5679168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T03:12:18.5679564Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.5679957Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.5680335Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 
2022-11-23T03:12:18.5680638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T03:12:18.5680800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 2 2022-11-23T03:12:18.5681024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 3 2022-11-23T03:12:18.5681450Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.5681656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T03:12:18.5682025Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.5682409Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.5682791Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.5683021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T03:12:18.5683244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T03:12:18.5683467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 2 2022-11-23T03:12:18.5683848Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.5684280Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 
2022-11-23T03:12:18.5684523Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 3 2022-11-23T03:12:18.5684913Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.5685374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.5685510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 3 2022-11-23T03:12:18.5685811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T03:12:18.5685991Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T03:12:18.5686400Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.5686636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 2 2022-11-23T03:12:18.5687016Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.5687396Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.5687774Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 
2022-11-23T03:12:18.5688038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 3 2022-11-23T03:12:18.5688224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T03:12:18.5688522Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T03:12:18.5688903Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.5689131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 2 2022-11-23T03:12:18.5689511Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.5689891Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.5690269Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.5690497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 3 2022-11-23T03:12:18.5690716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T03:12:18.5690941Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 2 2022-11-23T03:12:18.5691327Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.5691560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T03:12:18.5691943Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 
2022-11-23T03:12:18.5692322Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.5692697Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.5692976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T03:12:18.5693204Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 3 2022-11-23T03:12:18.5693428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T03:12:18.5693798Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.5694028Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 2 2022-11-23T03:12:18.5694406Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.5694787Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.5695275Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.5695340Z dist init r=1, world=4 2022-11-23T03:12:18.5695664Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5695976Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5696280Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5696582Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5696870Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5697169Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5697465Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5697760Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5698051Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5698342Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5698644Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5699013Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.5699048Z dist init r=0, world=4 2022-11-23T03:12:18.5699370Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5699675Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5699963Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5700309Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5700672Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5700912Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5701205Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5701501Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5701846Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5702142Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5702434Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.5702821Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.5702926Z dist init r=3, world=4 2022-11-23T03:12:18.5703231Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5703545Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5704155Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5704515Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5704862Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5705161Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5705452Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5705771Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5706058Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.5706261Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5706554Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5706849Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.5706947Z dist init r=2, world=4 2022-11-23T03:12:18.5707333Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5707653Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5707958Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5708254Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5708585Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5708943Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5709237Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.5709535Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5709829Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5710126Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5710428Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5710714Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.5710854Z ok (31.780s) 2022-11-23T03:12:18.5711167Z test_mixture_of_experts_with_delay_before_free_offload_true_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20902 2022-11-23T03:12:18.5711385Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20903 2022-11-23T03:12:18.5711600Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20904 2022-11-23T03:12:18.5711809Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20905 2022-11-23T03:12:18.5712194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5712373Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5712749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5712925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5713289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5713460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5713829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5714011Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5714371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.5714543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5714961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5715139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5715503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.5715671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.5716038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.5716297Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.5716466Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.5716705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.5716992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.5717229Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.5717610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5718002Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5718383Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.5718759Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.5718981Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.5719218Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.5719442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.5719662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.5720730Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5720846Z warnings.warn( 2022-11-23T03:12:18.5721858Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5721970Z warnings.warn( 2022-11-23T03:12:18.5722960Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5723075Z warnings.warn( 2022-11-23T03:12:18.5723318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.5723607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.5724621Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.5724828Z warnings.warn( 2022-11-23T03:12:18.5724961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.5725201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.5725662Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.5725799Z File "", line 1, in 2022-11-23T03:12:18.5725991Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5726135Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5726334Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5726478Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5726688Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5726794Z self.run() 2022-11-23T03:12:18.5726993Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5727140Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5727465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5727603Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5727971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5728096Z getattr(self, test_name)() 2022-11-23T03:12:18.5728459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5728560Z fn() 2022-11-23T03:12:18.5728926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5729128Z test(self, **param_kwargs) 2022-11-23T03:12:18.5729388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5729512Z return func(*args, **kwargs) 2022-11-23T03:12:18.5729782Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5729901Z self.run_subtests( 2022-11-23T03:12:18.5730254Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5730420Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5730778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5730929Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5731286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5731404Z output = model(*input) 2022-11-23T03:12:18.5731734Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5731868Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5732242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5732471Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5732847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5732964Z _lazy_init(state, module) 2022-11-23T03:12:18.5733296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5733434Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5733766Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5733891Z return func(*args, **kwargs) 2022-11-23T03:12:18.5734261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5734357Z p_assert( 2022-11-23T03:12:18.5734741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5734869Z traceback.print_stack() 2022-11-23T03:12:18.5735248Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5735637Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5736017Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.5736147Z File "", line 1, in 2022-11-23T03:12:18.5736276Z File "", line 1, in 2022-11-23T03:12:18.5736484Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5736625Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5736872Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5736964Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5737177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5737317Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5737531Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5737633Z self.run() 2022-11-23T03:12:18.5737832Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5738039Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5738183Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5738311Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5738519Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5738615Z self.run() 2022-11-23T03:12:18.5738959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5739096Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:18.5739293Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5739434Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5739774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5739896Z getattr(self, test_name)() 2022-11-23T03:12:18.5740310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5740364Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5740718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5740815Z fn() 2022-11-23T03:12:18.5741247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5741287Z getattr(self, test_name)() 2022-11-23T03:12:18.5741680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5741808Z test(self, **param_kwargs) 2022-11-23T03:12:18.5742157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5742302Z fn() 2022-11-23T03:12:18.5742658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5742779Z return func(*args, **kwargs) 2022-11-23T03:12:18.5743129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5743245Z test(self, **param_kwargs) 2022-11-23T03:12:18.5743502Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5743667Z self.run_subtests( 2022-11-23T03:12:18.5744466Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5744683Z return func(*args, **kwargs) 2022-11-23T03:12:18.5745039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5745198Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5745472Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5745592Z self.run_subtests( 2022-11-23T03:12:18.5745945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5746034Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5746353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5746515Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5746889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5747010Z output = model(*input) 2022-11-23T03:12:18.5747363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5747510Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5747817Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5748011Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5748335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5748457Z output = model(*input) 2022-11-23T03:12:18.5748840Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5749016Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5749337Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5749475Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5749823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5749945Z _lazy_init(state, module) 2022-11-23T03:12:18.5750319Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5750491Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5750833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5751053Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5751423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5751543Z _lazy_init(state, module) 2022-11-23T03:12:18.5751951Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5751988Z return func(*args, **kwargs) 2022-11-23T03:12:18.5752325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5752464Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5752841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5752945Z p_assert( 2022-11-23T03:12:18.5753276Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.5753468Z return func(*args, **kwargs) 2022-11-23T03:12:18.5753790Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5753915Z traceback.print_stack() 2022-11-23T03:12:18.5754289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5754392Z p_assert( 2022-11-23T03:12:18.5754714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5754834Z traceback.print_stack() 2022-11-23T03:12:18.5754962Z File "", line 1, in 2022-11-23T03:12:18.5755173Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5755297Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5755499Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5755647Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5755859Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5755958Z self.run() 2022-11-23T03:12:18.5756157Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5756293Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5756613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5756752Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5757108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5757229Z getattr(self, test_name)() 2022-11-23T03:12:18.5757584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5757682Z fn() 2022-11-23T03:12:18.5758044Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5758161Z test(self, **param_kwargs) 2022-11-23T03:12:18.5758497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5758620Z return func(*args, **kwargs) 2022-11-23T03:12:18.5758892Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5759007Z self.run_subtests( 2022-11-23T03:12:18.5759353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5759511Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5759866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5760021Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5760437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5760561Z output = model(*input) 2022-11-23T03:12:18.5760890Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5761033Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5761406Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5761579Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5761940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5762054Z _lazy_init(state, module) 2022-11-23T03:12:18.5762396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.5762575Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5762913Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5763035Z return func(*args, **kwargs) 2022-11-23T03:12:18.5763408Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5763506Z p_assert( 2022-11-23T03:12:18.5763836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5763959Z traceback.print_stack() 2022-11-23T03:12:18.5764187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.5764427Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.5764669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.5764910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.5765306Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 
2022-11-23T03:12:18.5765430Z File "", line 1, in 2022-11-23T03:12:18.5765637Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5765779Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5765977Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5766109Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5766318Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5766425Z self.run() 2022-11-23T03:12:18.5766627Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5766780Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5767113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5767247Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5767590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5767711Z getattr(self, test_name)() 2022-11-23T03:12:18.5768070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5768166Z fn() 2022-11-23T03:12:18.5768527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5768645Z test(self, **param_kwargs) 2022-11-23T03:12:18.5768996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5769167Z return func(*args, **kwargs) 2022-11-23T03:12:18.5769434Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5769612Z self.run_subtests( 2022-11-23T03:12:18.5769905Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5770095Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5770479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5770628Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5771096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5771121Z output = model(*input) 2022-11-23T03:12:18.5771487Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5771629Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5771995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5772169Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5772534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5772653Z _lazy_init(state, module) 2022-11-23T03:12:18.5773002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5773139Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5773457Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5773585Z return func(*args, **kwargs) 2022-11-23T03:12:18.5773967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5774066Z p_assert( 2022-11-23T03:12:18.5774399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5774523Z traceback.print_stack() 2022-11-23T03:12:18.5774917Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5775308Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5775687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.5775798Z File "", line 1, in 2022-11-23T03:12:18.5775999Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5776143Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5776342Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5776582Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5776704Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5776799Z self.run() 2022-11-23T03:12:18.5776984Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5777130Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5777471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5777598Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5777954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5778079Z getattr(self, test_name)() 2022-11-23T03:12:18.5778482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5778588Z fn() 2022-11-23T03:12:18.5778937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5779055Z test(self, **param_kwargs) 2022-11-23T03:12:18.5779403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5779524Z return func(*args, **kwargs) 2022-11-23T03:12:18.5779801Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5779908Z self.run_subtests( 2022-11-23T03:12:18.5780346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5780414Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5780817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5781018Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5781344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5781460Z output = model(*input) 2022-11-23T03:12:18.5781799Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5781914Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5782288Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5782466Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5782813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5782944Z _lazy_init(state, module) 2022-11-23T03:12:18.5783289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5783432Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5783762Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5784259Z return func(*args, **kwargs) 2022-11-23T03:12:18.5784398Z File "", line 1, in 2022-11-23T03:12:18.5784762Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5784847Z p_assert( 2022-11-23T03:12:18.5785196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5785309Z traceback.print_stack() 2022-11-23T03:12:18.5785522Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5785585Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5785774Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5785921Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5786135Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5786315Z self.run() 2022-11-23T03:12:18.5786425Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5786569Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5786909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5787040Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5787399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5787522Z getattr(self, test_name)() 2022-11-23T03:12:18.5787933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5788038Z fn() 2022-11-23T03:12:18.5788409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5788528Z test(self, **param_kwargs) 2022-11-23T03:12:18.5788875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5788997Z return func(*args, **kwargs) 2022-11-23T03:12:18.5789269Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5789375Z self.run_subtests( 2022-11-23T03:12:18.5789707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5789985Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5790351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5790504Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5790868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5790984Z output = model(*input) 2022-11-23T03:12:18.5791310Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5791445Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5791810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5791969Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5792332Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5792457Z _lazy_init(state, module) 2022-11-23T03:12:18.5792802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5792938Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5793271Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5793395Z return func(*args, **kwargs) 2022-11-23T03:12:18.5793770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5793854Z p_assert( 2022-11-23T03:12:18.5794193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5794312Z traceback.print_stack() 2022-11-23T03:12:18.5794440Z File "", line 1, in 2022-11-23T03:12:18.5794650Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5794788Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5794988Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5795120Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5795331Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5795430Z self.run() 2022-11-23T03:12:18.5795632Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5795777Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5796113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5796241Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5796595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5796704Z getattr(self, test_name)() 2022-11-23T03:12:18.5797104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5797205Z fn() 2022-11-23T03:12:18.5797570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5797686Z test(self, **param_kwargs) 2022-11-23T03:12:18.5798035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5798157Z return func(*args, **kwargs) 2022-11-23T03:12:18.5798435Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5798531Z self.run_subtests( 2022-11-23T03:12:18.5798882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5799094Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5799457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5799605Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5799974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5800092Z output = model(*input) 2022-11-23T03:12:18.5800414Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5800536Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5800909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5801082Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5801452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5801568Z _lazy_init(state, module) 2022-11-23T03:12:18.5801918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5802054Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5802386Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5802491Z return func(*args, **kwargs) 2022-11-23T03:12:18.5802864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5802957Z p_assert( 2022-11-23T03:12:18.5803286Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5803408Z traceback.print_stack() 2022-11-23T03:12:18.5803654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.5803902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.5804140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.5804362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.5804765Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.5804946Z File "", line 1, in 2022-11-23T03:12:18.5805105Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5805245Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5805442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5805598Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5805850Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5805943Z self.run() 2022-11-23T03:12:18.5806146Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5806289Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5806627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5806759Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5807120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5807238Z getattr(self, test_name)() 2022-11-23T03:12:18.5807590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5807671Z fn() 2022-11-23T03:12:18.5808038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5808255Z test(self, **param_kwargs) 2022-11-23T03:12:18.5808566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5808693Z return func(*args, **kwargs) 2022-11-23T03:12:18.5808968Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5809082Z self.run_subtests( 2022-11-23T03:12:18.5809424Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5809568Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5809928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5810074Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5810451Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5810568Z output = model(*input) 2022-11-23T03:12:18.5810892Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5811031Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5811445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5811561Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5811921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5812034Z _lazy_init(state, module) 2022-11-23T03:12:18.5812380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5812524Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5812858Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5812978Z return func(*args, **kwargs) 2022-11-23T03:12:18.5813353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5813437Z p_assert( 2022-11-23T03:12:18.5813768Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5813894Z traceback.print_stack() 2022-11-23T03:12:18.5814280Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5814675Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5815101Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.5815237Z File "", line 1, in 2022-11-23T03:12:18.5815445Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5815568Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5815769Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5815913Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5816119Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5816220Z self.run() 2022-11-23T03:12:18.5816418Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5816563Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5816903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5817065Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5817434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5817555Z getattr(self, test_name)() 2022-11-23T03:12:18.5817926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5818010Z fn() 2022-11-23T03:12:18.5818374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5818495Z test(self, **param_kwargs) 2022-11-23T03:12:18.5818834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5818960Z return func(*args, **kwargs) 2022-11-23T03:12:18.5819237Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5819352Z self.run_subtests( 2022-11-23T03:12:18.5819705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5819867Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5820224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5820377Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5820751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5820853Z output = model(*input) 2022-11-23T03:12:18.5821181Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5821316Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5821685Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5821867Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5822229Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5822346Z _lazy_init(state, module) 2022-11-23T03:12:18.5822687Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5822812Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5823243Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5823291Z return func(*args, **kwargs) 2022-11-23T03:12:18.5823660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5823750Z p_assert( 2022-11-23T03:12:18.5824495Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5824705Z traceback.print_stack() 2022-11-23T03:12:18.5824828Z File "", line 1, in 2022-11-23T03:12:18.5825042Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5825179Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5825377Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5825513Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5825733Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5825822Z self.run() 2022-11-23T03:12:18.5825971Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5826057Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5826407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5826601Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5826961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5827085Z getattr(self, test_name)() 2022-11-23T03:12:18.5827440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5827532Z fn() 2022-11-23T03:12:18.5827891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5827995Z test(self, **param_kwargs) 2022-11-23T03:12:18.5828346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5828473Z return func(*args, **kwargs) 2022-11-23T03:12:18.5828745Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5828857Z self.run_subtests( 2022-11-23T03:12:18.5829281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5829501Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5829802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5829936Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5830305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5830417Z output = model(*input) 2022-11-23T03:12:18.5830743Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5830881Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5831253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5831426Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5831784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5831887Z _lazy_init(state, module) 2022-11-23T03:12:18.5832233Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5832373Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5832709Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5832835Z return func(*args, **kwargs) 2022-11-23T03:12:18.5833208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5833312Z p_assert( 2022-11-23T03:12:18.5833690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5833805Z traceback.print_stack() 2022-11-23T03:12:18.5833931Z File "", line 1, in 2022-11-23T03:12:18.5834135Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5834272Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5834572Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5834709Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5834929Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5834969Z self.run() 2022-11-23T03:12:18.5835201Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5835266Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5835670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5835786Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5836141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5836265Z getattr(self, test_name)() 2022-11-23T03:12:18.5836615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5836693Z fn() 2022-11-23T03:12:18.5837053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5837175Z test(self, **param_kwargs) 2022-11-23T03:12:18.5837526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5837663Z return func(*args, **kwargs) 2022-11-23T03:12:18.5838018Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5838063Z self.run_subtests( 2022-11-23T03:12:18.5838391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5838595Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5838894Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5839044Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5839509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5839536Z output = model(*input) 2022-11-23T03:12:18.5839851Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5839983Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5840355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5840512Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5840876Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5840993Z _lazy_init(state, module) 2022-11-23T03:12:18.5841339Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5841478Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5841807Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5841930Z return func(*args, **kwargs) 2022-11-23T03:12:18.5842355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5842445Z p_assert( 2022-11-23T03:12:18.5842827Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5842954Z traceback.print_stack() 2022-11-23T03:12:18.5843189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.5843423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.5843669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.5843901Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.5844304Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 
2022-11-23T03:12:18.5844432Z File "", line 1, in 2022-11-23T03:12:18.5844622Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5844816Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5845016Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5845161Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5845368Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5845468Z self.run() 2022-11-23T03:12:18.5845665Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5845792Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5846128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5846255Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5846608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5846734Z getattr(self, test_name)() 2022-11-23T03:12:18.5847096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5847189Z fn() 2022-11-23T03:12:18.5847547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5847651Z test(self, **param_kwargs) 2022-11-23T03:12:18.5848079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5848126Z return func(*args, **kwargs) 2022-11-23T03:12:18.5848461Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5848512Z self.run_subtests( 2022-11-23T03:12:18.5848860Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5849017Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5849377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5849512Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5849886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5849997Z output = model(*input) 2022-11-23T03:12:18.5850319Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5850554Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5850818Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5850986Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5851341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5851499Z _lazy_init(state, module) 2022-11-23T03:12:18.5851855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5851993Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5852413Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5852448Z return func(*args, **kwargs) 2022-11-23T03:12:18.5852819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5852913Z p_assert( 2022-11-23T03:12:18.5853248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5853354Z traceback.print_stack() 2022-11-23T03:12:18.5853743Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5854204Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5854583Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.5854711Z File "", line 1, in 2022-11-23T03:12:18.5854918Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5855054Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5855256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5855389Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5855597Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5855696Z self.run() 2022-11-23T03:12:18.5855895Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5856049Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5856389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5856517Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5856874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5856979Z getattr(self, test_name)() 2022-11-23T03:12:18.5857333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5857423Z fn() 2022-11-23T03:12:18.5857782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5857903Z test(self, **param_kwargs) 2022-11-23T03:12:18.5858246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5858375Z return func(*args, **kwargs) 2022-11-23T03:12:18.5858646Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5858742Z self.run_subtests( 2022-11-23T03:12:18.5859086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5859245Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5859602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5859752Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5860123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5860239Z output = model(*input) 2022-11-23T03:12:18.5860609Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5860739Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5861109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5861279Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5861645Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5861763Z _lazy_init(state, module) 2022-11-23T03:12:18.5862110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5862249Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5862585Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5862801Z return func(*args, **kwargs) 2022-11-23T03:12:18.5863122Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5863219Z p_assert( 2022-11-23T03:12:18.5863559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5863685Z traceback.print_stack() 2022-11-23T03:12:18.5863809Z File "", line 1, in 2022-11-23T03:12:18.5864373Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5864436Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5864611Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5864839Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5865067Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5865162Z self.run() 2022-11-23T03:12:18.5865346Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5865495Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5865844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5865965Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5866324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5866421Z getattr(self, test_name)() 2022-11-23T03:12:18.5866703Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5866799Z fn() 2022-11-23T03:12:18.5867159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5867279Z test(self, **param_kwargs) 2022-11-23T03:12:18.5867636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5867749Z return func(*args, **kwargs) 2022-11-23T03:12:18.5868023Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5868135Z self.run_subtests( 2022-11-23T03:12:18.5868480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5868637Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5868996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5869145Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5869518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5869623Z output = model(*input) 2022-11-23T03:12:18.5870019Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5870217Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5870602Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5870780Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5871146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5871345Z _lazy_init(state, module) 2022-11-23T03:12:18.5871603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5871728Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5872060Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5872250Z return func(*args, **kwargs) 2022-11-23T03:12:18.5872673Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5872727Z p_assert( 2022-11-23T03:12:18.5873057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5873211Z traceback.print_stack() 2022-11-23T03:12:18.5873307Z File "", line 1, in 2022-11-23T03:12:18.5873498Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5873641Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5873837Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5873985Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5874194Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5874298Z self.run() 2022-11-23T03:12:18.5874503Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5874647Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5874968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5875101Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5875459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5875578Z getattr(self, test_name)() 2022-11-23T03:12:18.5875934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5876031Z fn() 2022-11-23T03:12:18.5876386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5876492Z test(self, **param_kwargs) 2022-11-23T03:12:18.5876854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5876971Z return func(*args, **kwargs) 2022-11-23T03:12:18.5877244Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5877358Z self.run_subtests( 2022-11-23T03:12:18.5877705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5877864Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5878286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5878369Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5878723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5878838Z output = model(*input) 2022-11-23T03:12:18.5879209Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5879352Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5879725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5879890Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5880255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5880371Z _lazy_init(state, module) 2022-11-23T03:12:18.5880788Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5880931Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5881266Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5881431Z return func(*args, **kwargs) 2022-11-23T03:12:18.5881813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5881910Z p_assert( 2022-11-23T03:12:18.5882242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5882361Z traceback.print_stack() 2022-11-23T03:12:18.5882588Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.5882833Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.5883070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.5883309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.5883712Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.5883837Z File "", line 1, in 2022-11-23T03:12:18.5884042Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5884176Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5884361Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5884507Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5884716Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5884819Z self.run() 2022-11-23T03:12:18.5885017Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5885159Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5885496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5885613Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5885973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5886096Z getattr(self, test_name)() 2022-11-23T03:12:18.5886449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5886541Z fn() 2022-11-23T03:12:18.5886897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5887020Z test(self, **param_kwargs) 2022-11-23T03:12:18.5887371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5887476Z return func(*args, **kwargs) 2022-11-23T03:12:18.5887769Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5887862Z self.run_subtests( 2022-11-23T03:12:18.5888253Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5888419Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5888782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5888931Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5889297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5889397Z output = model(*input) 2022-11-23T03:12:18.5889736Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5889876Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5890243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5890462Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5890833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5890952Z _lazy_init(state, module) 2022-11-23T03:12:18.5891300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5891446Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5891764Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5891888Z return func(*args, **kwargs) 2022-11-23T03:12:18.5892256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5892353Z p_assert( 2022-11-23T03:12:18.5892709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5892814Z traceback.print_stack() 2022-11-23T03:12:18.5893205Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5893598Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5893970Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.5894094Z File "", line 1, in 2022-11-23T03:12:18.5894301Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5894434Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5894633Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5894783Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5894992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5895078Z self.run() 2022-11-23T03:12:18.5895279Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5895416Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5895755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5895885Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5896237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5896354Z getattr(self, test_name)() 2022-11-23T03:12:18.5896706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5896785Z fn() 2022-11-23T03:12:18.5897191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5897325Z test(self, **param_kwargs) 2022-11-23T03:12:18.5897684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5897801Z return func(*args, **kwargs) 2022-11-23T03:12:18.5898073Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5898175Z self.run_subtests( 2022-11-23T03:12:18.5898519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5898661Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5899018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5899166Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5899587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5899708Z output = model(*input) 2022-11-23T03:12:18.5900022Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5900160Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5900534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5900690Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5901051Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5901165Z _lazy_init(state, module) 2022-11-23T03:12:18.5901514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5901661Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5901995Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5902112Z return func(*args, **kwargs) 2022-11-23T03:12:18.5902485Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5902569Z p_assert( 2022-11-23T03:12:18.5902901Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5903023Z traceback.print_stack() 2022-11-23T03:12:18.5903145Z File "", line 1, in 2022-11-23T03:12:18.5903352Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5903493Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5903688Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5903842Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5904312Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5904500Z self.run() 2022-11-23T03:12:18.5904694Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5904822Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5905171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5905318Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5905661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5905793Z getattr(self, test_name)() 2022-11-23T03:12:18.5906115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5906140Z fn() 2022-11-23T03:12:18.5906568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5906691Z test(self, **param_kwargs) 2022-11-23T03:12:18.5907041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5907167Z return func(*args, **kwargs) 2022-11-23T03:12:18.5907446Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5907542Z self.run_subtests( 2022-11-23T03:12:18.5907887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5908047Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5908402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5908709Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5908995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5909110Z output = model(*input) 2022-11-23T03:12:18.5909432Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5909567Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5909922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5910093Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5910453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5910573Z _lazy_init(state, module) 2022-11-23T03:12:18.5910918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5911064Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5911394Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5911515Z return func(*args, **kwargs) 2022-11-23T03:12:18.5911908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5911980Z p_assert( 2022-11-23T03:12:18.5912304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5912425Z traceback.print_stack() 2022-11-23T03:12:18.5912554Z File "", line 1, in 2022-11-23T03:12:18.5912754Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5912886Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5913069Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5913218Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5913422Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5913519Z self.run() 2022-11-23T03:12:18.5913715Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5913861Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5914202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5914336Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5914679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5914801Z getattr(self, test_name)() 2022-11-23T03:12:18.5915154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5915249Z fn() 2022-11-23T03:12:18.5915663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5915792Z test(self, **param_kwargs) 2022-11-23T03:12:18.5916138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5916262Z return func(*args, **kwargs) 2022-11-23T03:12:18.5916520Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5916624Z self.run_subtests( 2022-11-23T03:12:18.5916971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5917129Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5917487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5917697Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5918078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5918193Z output = model(*input) 2022-11-23T03:12:18.5918498Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5918633Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5919005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5919180Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5919532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5919644Z _lazy_init(state, module) 2022-11-23T03:12:18.5919988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5920133Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5920453Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5920570Z return func(*args, **kwargs) 2022-11-23T03:12:18.5920937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5921032Z p_assert( 2022-11-23T03:12:18.5921360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5921474Z traceback.print_stack() 2022-11-23T03:12:18.5921718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.5921956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.5922185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.5922424Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.5922815Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 
2022-11-23T03:12:18.5922941Z File "", line 1, in 2022-11-23T03:12:18.5923149Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5923289Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5923482Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5923624Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5923819Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5923921Z self.run() 2022-11-23T03:12:18.5924124Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5924314Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5924664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5924789Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5925148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5925336Z getattr(self, test_name)() 2022-11-23T03:12:18.5925683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5925697Z fn() 2022-11-23T03:12:18.5926051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5926174Z test(self, **param_kwargs) 2022-11-23T03:12:18.5926527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5926702Z return func(*args, **kwargs) 2022-11-23T03:12:18.5926980Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5927094Z self.run_subtests( 2022-11-23T03:12:18.5927429Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5927582Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5927936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5928085Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5928454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5928577Z output = model(*input) 2022-11-23T03:12:18.5928912Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5929049Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5929403Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5929631Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5930007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5930057Z _lazy_init(state, module) 2022-11-23T03:12:18.5930406Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5930543Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5930877Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5931005Z return func(*args, **kwargs) 2022-11-23T03:12:18.5931367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5931466Z p_assert( 2022-11-23T03:12:18.5931794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5931911Z traceback.print_stack() 2022-11-23T03:12:18.5932307Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5932702Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5933080Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.5933205Z File "", line 1, in 2022-11-23T03:12:18.5933398Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5933620Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5933833Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5934017Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5934185Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5934288Z self.run() 2022-11-23T03:12:18.5934493Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5934632Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5934957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5935083Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5935440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5935629Z getattr(self, test_name)() 2022-11-23T03:12:18.5935989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5936086Z fn() 2022-11-23T03:12:18.5936447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5936561Z test(self, **param_kwargs) 2022-11-23T03:12:18.5936896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5937014Z return func(*args, **kwargs) 2022-11-23T03:12:18.5937323Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5937399Z self.run_subtests( 2022-11-23T03:12:18.5937746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5937909Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5938272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5938419Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5938818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5938893Z output = model(*input) 2022-11-23T03:12:18.5939209Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5939340Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5939714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5939883Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5940248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5940374Z _lazy_init(state, module) 2022-11-23T03:12:18.5940712Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5940852Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5941184Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5941303Z return func(*args, **kwargs) 2022-11-23T03:12:18.5941672Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5941771Z p_assert( 2022-11-23T03:12:18.5942099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5942213Z traceback.print_stack() 2022-11-23T03:12:18.5942377Z File "", line 1, in 2022-11-23T03:12:18.5942587Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5942770Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5942979Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5943123Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5943326Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5943433Z self.run() 2022-11-23T03:12:18.5943617Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5943758Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5944443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5944569Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5944921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5945123Z getattr(self, test_name)() 2022-11-23T03:12:18.5945485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5945567Z fn() 2022-11-23T03:12:18.5945927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5945950Z test(self, **param_kwargs) 2022-11-23T03:12:18.5946293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5946412Z return func(*args, **kwargs) 2022-11-23T03:12:18.5946684Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5946791Z self.run_subtests( 2022-11-23T03:12:18.5947134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5947296Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5947642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5947796Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5948164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5948274Z output = model(*input) 2022-11-23T03:12:18.5948597Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5948750Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5949102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5949272Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5949626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5949751Z _lazy_init(state, module) 2022-11-23T03:12:18.5950090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5950226Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5950561Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5950680Z return func(*args, **kwargs) 2022-11-23T03:12:18.5951053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5951158Z p_assert( 2022-11-23T03:12:18.5951472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5951591Z traceback.print_stack() 2022-11-23T03:12:18.5951713Z File "", line 1, in 2022-11-23T03:12:18.5951981Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5952126Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5952323Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5952467Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5952676Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5952763Z self.run() 2022-11-23T03:12:18.5953009Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5953101Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5953442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5953574Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5953926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5954095Z getattr(self, test_name)() 2022-11-23T03:12:18.5954434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5954525Z fn() 2022-11-23T03:12:18.5954889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5955012Z test(self, **param_kwargs) 2022-11-23T03:12:18.5955358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5955481Z return func(*args, **kwargs) 2022-11-23T03:12:18.5955750Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5955860Z self.run_subtests( 2022-11-23T03:12:18.5956192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5956357Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5956713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5956866Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5957235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5957343Z output = model(*input) 2022-11-23T03:12:18.5957657Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5957797Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5958152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5958324Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5958688Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5958810Z _lazy_init(state, module) 2022-11-23T03:12:18.5959158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5959294Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5959624Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5959743Z return func(*args, **kwargs) 2022-11-23T03:12:18.5960102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5960198Z p_assert( 2022-11-23T03:12:18.5960529Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5960651Z traceback.print_stack() 2022-11-23T03:12:18.5960942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.5961189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.5961421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.5961655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.5962044Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.5962157Z File "", line 1, in 2022-11-23T03:12:18.5962368Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5962504Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5962702Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5962897Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5963114Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5963210Z self.run() 2022-11-23T03:12:18.5963394Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5963614Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5963869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5963996Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5964344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5964461Z getattr(self, test_name)() 2022-11-23T03:12:18.5964822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5964918Z fn() 2022-11-23T03:12:18.5965272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.5965391Z test(self, **param_kwargs) 2022-11-23T03:12:18.5965738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5965856Z return func(*args, **kwargs) 2022-11-23T03:12:18.5966131Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5966243Z self.run_subtests( 2022-11-23T03:12:18.5966591Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5966745Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5967089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5967246Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5967615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5967726Z output = model(*input) 2022-11-23T03:12:18.5968047Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5968181Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5968551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5968726Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5969073Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5969197Z _lazy_init(state, module) 2022-11-23T03:12:18.5969536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5969675Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5970052Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5970208Z return func(*args, **kwargs) 2022-11-23T03:12:18.5970614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5970715Z p_assert( 2022-11-23T03:12:18.5971033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.5971154Z traceback.print_stack() 2022-11-23T03:12:18.5971544Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5971930Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5972377Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.5972500Z File "", line 1, in 2022-11-23T03:12:18.5972703Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5972843Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5973028Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5973182Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5973395Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5973496Z self.run() 2022-11-23T03:12:18.5973788Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5973925Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5974259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5974495Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5974741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5974862Z getattr(self, test_name)() 2022-11-23T03:12:18.5975221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5975320Z fn() 2022-11-23T03:12:18.5975679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5975800Z test(self, **param_kwargs) 2022-11-23T03:12:18.5976152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5976271Z return func(*args, **kwargs) 2022-11-23T03:12:18.5976528Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5976644Z self.run_subtests( 2022-11-23T03:12:18.5976991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5977149Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5977507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5977658Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5978025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5978144Z output = model(*input) 2022-11-23T03:12:18.5978449Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5978586Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5978959Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5979183Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5979557Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5979681Z _lazy_init(state, module) 2022-11-23T03:12:18.5980033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5980182Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5980498Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5980628Z return func(*args, **kwargs) 2022-11-23T03:12:18.5981102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5981115Z p_assert( 2022-11-23T03:12:18.5981526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5981657Z traceback.print_stack() 2022-11-23T03:12:18.5981792Z File "", line 1, in 2022-11-23T03:12:18.5982004Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5982129Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5982334Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5982574Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5982707Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5982814Z self.run() 2022-11-23T03:12:18.5983021Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5983172Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5983490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5983633Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5984315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5984461Z getattr(self, test_name)() 2022-11-23T03:12:18.5984846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5984968Z fn() 2022-11-23T03:12:18.5985333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5985441Z test(self, **param_kwargs) 2022-11-23T03:12:18.5985787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5985912Z return func(*args, **kwargs) 2022-11-23T03:12:18.5986206Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5986317Z self.run_subtests( 2022-11-23T03:12:18.5986589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5986754Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5987120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5987275Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5987627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5987749Z output = model(*input) 2022-11-23T03:12:18.5988076Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5988280Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5988744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5988932Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5989303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5989427Z _lazy_init(state, module) 2022-11-23T03:12:18.5989754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5989901Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5990235Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5990360Z return func(*args, **kwargs) 2022-11-23T03:12:18.5990743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.5990960Z p_assert( 2022-11-23T03:12:18.5991304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.5991432Z traceback.print_stack() 2022-11-23T03:12:18.5991543Z File "", line 1, in 2022-11-23T03:12:18.5991755Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.5991899Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.5992104Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.5992259Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.5992473Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.5992580Z self.run() 2022-11-23T03:12:18.5992785Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.5992911Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.5993263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.5993405Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.5993771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.5993896Z getattr(self, test_name)() 2022-11-23T03:12:18.5994257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.5994358Z fn() 2022-11-23T03:12:18.5994723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.5994828Z test(self, **param_kwargs) 2022-11-23T03:12:18.5995187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.5995318Z return func(*args, **kwargs) 2022-11-23T03:12:18.5995646Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.5995723Z self.run_subtests( 2022-11-23T03:12:18.5996078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.5996244Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.5996608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.5996742Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.5997121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.5997249Z output = model(*input) 2022-11-23T03:12:18.5997574Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.5997717Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.5998151Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.5998338Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.5998712Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.5998815Z _lazy_init(state, module) 2022-11-23T03:12:18.5999169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.5999319Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.5999657Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.5999782Z return func(*args, **kwargs) 2022-11-23T03:12:18.6000162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6000323Z p_assert( 2022-11-23T03:12:18.6000665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6000773Z traceback.print_stack() 2022-11-23T03:12:18.6001024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.6001274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.6001518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.6001760Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.6002159Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 
2022-11-23T03:12:18.6002292Z File "", line 1, in 2022-11-23T03:12:18.6002505Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6002636Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6002844Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6002997Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6003215Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6003323Z self.run() 2022-11-23T03:12:18.6003529Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6003676Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6004020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6004136Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6004501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6004634Z getattr(self, test_name)() 2022-11-23T03:12:18.6004999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6005099Z fn() 2022-11-23T03:12:18.6005465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6005592Z test(self, **param_kwargs) 2022-11-23T03:12:18.6006008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6006057Z return func(*args, **kwargs) 2022-11-23T03:12:18.6006341Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6006554Z self.run_subtests( 2022-11-23T03:12:18.6006899Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6007081Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6007406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6007569Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6007944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6008046Z output = model(*input) 2022-11-23T03:12:18.6008374Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6008521Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6008897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6009074Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6009439Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6017035Z _lazy_init(state, module) 2022-11-23T03:12:18.6017500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6017655Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6018038Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6018141Z return func(*args, **kwargs) 2022-11-23T03:12:18.6018573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6018609Z p_assert( 2022-11-23T03:12:18.6019043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6019112Z traceback.print_stack() 2022-11-23T03:12:18.6019469Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.6019876Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.6020250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.6020384Z File "", line 1, in 2022-11-23T03:12:18.6020591Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6020794Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6021006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6021154Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6021366Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6021476Z self.run() 2022-11-23T03:12:18.6021665Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6021816Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6022156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6022294Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6022654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6022772Z getattr(self, test_name)() 2022-11-23T03:12:18.6023131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6023222Z fn() 2022-11-23T03:12:18.6023569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6023690Z test(self, **param_kwargs) 2022-11-23T03:12:18.6024304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6024604Z return func(*args, **kwargs) 2022-11-23T03:12:18.6024931Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6025077Z self.run_subtests( 2022-11-23T03:12:18.6025442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6025590Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6025954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6026109Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6026480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6026602Z output = model(*input) 2022-11-23T03:12:18.6026997Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6027046Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6027423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6027601Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6027950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6028074Z _lazy_init(state, module) 2022-11-23T03:12:18.6028424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6028564Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6028898Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6029025Z return func(*args, **kwargs) 2022-11-23T03:12:18.6029407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6029510Z p_assert( 2022-11-23T03:12:18.6029829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6029950Z traceback.print_stack() 2022-11-23T03:12:18.6030172Z File "", line 1, in 2022-11-23T03:12:18.6030395Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6030519Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6030717Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6030863Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6031057Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6031160Z self.run() 2022-11-23T03:12:18.6031370Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6031514Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6031848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6031974Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6032330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6032447Z getattr(self, test_name)() 2022-11-23T03:12:18.6032789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6032889Z fn() 2022-11-23T03:12:18.6033253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6033374Z test(self, **param_kwargs) 2022-11-23T03:12:18.6033773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6033897Z return func(*args, **kwargs) 2022-11-23T03:12:18.6034177Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6034289Z self.run_subtests( 2022-11-23T03:12:18.6034628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6034785Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6035148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6035299Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6035667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6035831Z output = model(*input) 2022-11-23T03:12:18.6036159Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6036297Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6036654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6036829Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6037191Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6037310Z _lazy_init(state, module) 2022-11-23T03:12:18.6037658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6037845Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6038141Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6038269Z return func(*args, **kwargs) 2022-11-23T03:12:18.6038632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6038734Z p_assert( 2022-11-23T03:12:18.6039165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6039197Z traceback.print_stack() 2022-11-23T03:12:18.6039327Z File "", line 1, in 2022-11-23T03:12:18.6039532Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6039669Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6039867Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6040000Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6040210Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6040318Z self.run() 2022-11-23T03:12:18.6040518Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6040659Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6041001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6041135Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6041508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6041602Z getattr(self, test_name)() 2022-11-23T03:12:18.6041958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6042052Z fn() 2022-11-23T03:12:18.6042478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6042600Z test(self, **param_kwargs) 2022-11-23T03:12:18.6043011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6043141Z return func(*args, **kwargs) 2022-11-23T03:12:18.6043399Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6043508Z self.run_subtests( 2022-11-23T03:12:18.6043856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6044018Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6044381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6044524Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6044889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6045047Z output = model(*input) 2022-11-23T03:12:18.6045360Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6045494Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6045962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6046045Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6046399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6046513Z _lazy_init(state, module) 2022-11-23T03:12:18.6046856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6046998Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6047315Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6047448Z return func(*args, **kwargs) 2022-11-23T03:12:18.6047820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6047918Z p_assert( 2022-11-23T03:12:18.6048251Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6048433Z traceback.print_stack() 2022-11-23T03:12:18.6048618Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.6048856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.6049154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.6049297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.6049698Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.6049826Z File "", line 1, in 2022-11-23T03:12:18.6050027Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6050168Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6050368Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6050520Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6050713Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6050815Z self.run() 2022-11-23T03:12:18.6051017Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6051160Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6051500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6051687Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6052056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6052180Z getattr(self, test_name)() 2022-11-23T03:12:18.6052518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6052614Z fn() 2022-11-23T03:12:18.6052978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6053100Z test(self, **param_kwargs) 2022-11-23T03:12:18.6053446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6053563Z return func(*args, **kwargs) 2022-11-23T03:12:18.6053898Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6054011Z self.run_subtests( 2022-11-23T03:12:18.6054347Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6054505Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6054866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6055012Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6055380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6055493Z output = model(*input) 2022-11-23T03:12:18.6055817Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6055958Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6056318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6056490Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6056849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6056966Z _lazy_init(state, module) 2022-11-23T03:12:18.6057309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6057447Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6057778Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6057923Z return func(*args, **kwargs) 2022-11-23T03:12:18.6058255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6058350Z p_assert( 2022-11-23T03:12:18.6058684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6058804Z traceback.print_stack() 2022-11-23T03:12:18.6059198Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.6059585Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.6059974Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.6060198Z File "", line 1, in 2022-11-23T03:12:18.6060351Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6060560Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6060781Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6060907Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6061166Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6061276Z self.run() 2022-11-23T03:12:18.6061476Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6061620Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6061943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6062069Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6062427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6062550Z getattr(self, test_name)() 2022-11-23T03:12:18.6062907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6063000Z fn() 2022-11-23T03:12:18.6063410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6063526Z test(self, **param_kwargs) 2022-11-23T03:12:18.6064303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6064398Z return func(*args, **kwargs) 2022-11-23T03:12:18.6064760Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6064886Z self.run_subtests( 2022-11-23T03:12:18.6065239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6065392Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6065756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6065899Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6066174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6066292Z output = model(*input) 2022-11-23T03:12:18.6066612Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6066756Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6067131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6067305Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6067662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6067783Z _lazy_init(state, module) 2022-11-23T03:12:18.6068113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6068256Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6068588Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6068707Z return func(*args, **kwargs) 2022-11-23T03:12:18.6069080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6069184Z p_assert( 2022-11-23T03:12:18.6069514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6069632Z traceback.print_stack() 2022-11-23T03:12:18.6069744Z File "", line 1, in 2022-11-23T03:12:18.6069946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6070085Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6070218Z File "", line 1, in 2022-11-23T03:12:18.6070459Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6070712Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6070904Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6070989Z self.run() 2022-11-23T03:12:18.6071196Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6071414Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6071643Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6071850Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6072067Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6072211Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6072545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6072714Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6072840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 
2022-11-23T03:12:18.6073017Z self.run() 2022-11-23T03:12:18.6073313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6073420Z getattr(self, test_name)() 2022-11-23T03:12:18.6073718Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6073769Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6074125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6074207Z fn() 2022-11-23T03:12:18.6074539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6074665Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6075028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6075150Z test(self, **param_kwargs) 2022-11-23T03:12:18.6075501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6075622Z getattr(self, test_name)() 2022-11-23T03:12:18.6075958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6076084Z return func(*args, **kwargs) 2022-11-23T03:12:18.6076444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6076538Z fn() 2022-11-23T03:12:18.6076809Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6076937Z self.run_subtests( 2022-11-23T03:12:18.6077285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6077404Z test(self, **param_kwargs) 
2022-11-23T03:12:18.6077736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6077887Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6078231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6078356Z return func(*args, **kwargs) 2022-11-23T03:12:18.6078714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6078862Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6079132Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6079244Z self.run_subtests( 2022-11-23T03:12:18.6079647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6079769Z output = model(*input) 2022-11-23T03:12:18.6080121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6080277Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6080600Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6080738Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6081094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6081244Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6081615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6081863Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6082237Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6082359Z output = model(*input) 2022-11-23T03:12:18.6082715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6082904Z _lazy_init(state, module) 2022-11-23T03:12:18.6083148Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6083284Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6083636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6083761Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6084130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6084308Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6084638Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6084757Z return func(*args, **kwargs) 2022-11-23T03:12:18.6085115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6085228Z _lazy_init(state, module) 2022-11-23T03:12:18.6085599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6085683Z p_assert( 2022-11-23T03:12:18.6086025Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6086169Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6086507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6086710Z 
traceback.print_stack() 2022-11-23T03:12:18.6086961Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6087076Z return func(*args, **kwargs) 2022-11-23T03:12:18.6087438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6087616Z p_assert( 2022-11-23T03:12:18.6087852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6087976Z traceback.print_stack() 2022-11-23T03:12:18.6088219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.6088455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.6088728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.6088960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.6089349Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 
2022-11-23T03:12:18.6089460Z File "", line 1, in 2022-11-23T03:12:18.6089664Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6089798Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6090000Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6090145Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6090348Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6090449Z self.run() 2022-11-23T03:12:18.6090678Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6090830Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6091164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6091289Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6091650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6091777Z getattr(self, test_name)() 2022-11-23T03:12:18.6092128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6092226Z fn() 2022-11-23T03:12:18.6092570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6092687Z test(self, **param_kwargs) 2022-11-23T03:12:18.6093035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6093161Z return func(*args, **kwargs) 2022-11-23T03:12:18.6093463Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6093544Z self.run_subtests( 2022-11-23T03:12:18.6093894Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6094054Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6094398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6094545Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6094919Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6095043Z output = model(*input) 2022-11-23T03:12:18.6095369Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6095510Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6095882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6096055Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6096417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6096520Z _lazy_init(state, module) 2022-11-23T03:12:18.6096871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6097011Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6097346Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6097470Z return func(*args, **kwargs) 2022-11-23T03:12:18.6097889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6097990Z p_assert( 2022-11-23T03:12:18.6098310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6098432Z traceback.print_stack() 2022-11-23T03:12:18.6098829Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.6099223Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.6099613Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.6099746Z File "", line 1, in 2022-11-23T03:12:18.6100000Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6100146Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6100349Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6100482Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6100687Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6100790Z self.run() 2022-11-23T03:12:18.6100983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6101122Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6101461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6101601Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6101941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6102067Z getattr(self, test_name)() 2022-11-23T03:12:18.6102423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6102525Z fn() 2022-11-23T03:12:18.6102892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6103006Z test(self, **param_kwargs) 2022-11-23T03:12:18.6103356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6103473Z return func(*args, **kwargs) 2022-11-23T03:12:18.6103730Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6103839Z self.run_subtests( 2022-11-23T03:12:18.6104550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6104721Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6105077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6105247Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6105619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6105723Z output = model(*input) 2022-11-23T03:12:18.6105943Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6106082Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6106457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6106623Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6107054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6107177Z _lazy_init(state, module) 2022-11-23T03:12:18.6107522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6107659Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6107976Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6108100Z return func(*args, **kwargs) 2022-11-23T03:12:18.6108470Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6108566Z p_assert( 2022-11-23T03:12:18.6108894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6109012Z traceback.print_stack() 2022-11-23T03:12:18.6109135Z File "", line 1, in 2022-11-23T03:12:18.6109408Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6109632Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6109732Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6109880Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6110088Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6110188Z self.run() 2022-11-23T03:12:18.6110387Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6110526Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6110864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6111037Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6111338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6111459Z getattr(self, test_name)() 2022-11-23T03:12:18.6111813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6111902Z fn() 2022-11-23T03:12:18.6112259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6112373Z test(self, **param_kwargs) 2022-11-23T03:12:18.6112706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6112826Z return func(*args, **kwargs) 2022-11-23T03:12:18.6113096Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6113205Z self.run_subtests( 2022-11-23T03:12:18.6113553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6113714Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6114070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6114215Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6114583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6114684Z output = model(*input) 2022-11-23T03:12:18.6115004Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6115143Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6115508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6115676Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6116082Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6116206Z _lazy_init(state, module) 2022-11-23T03:12:18.6116549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6116674Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6117003Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6117120Z return func(*args, **kwargs) 2022-11-23T03:12:18.6117487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6117589Z p_assert( 2022-11-23T03:12:18.6117982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6118049Z traceback.print_stack() 2022-11-23T03:12:18.6118228Z File "", line 1, in 2022-11-23T03:12:18.6118434Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6118652Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6118779Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6118930Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6119134Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6119232Z self.run() 2022-11-23T03:12:18.6119430Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6119556Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6119895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6120027Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6120384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6120512Z getattr(self, test_name)() 2022-11-23T03:12:18.6120872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6120967Z fn() 2022-11-23T03:12:18.6121325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6121428Z test(self, **param_kwargs) 2022-11-23T03:12:18.6121779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6121899Z return func(*args, **kwargs) 2022-11-23T03:12:18.6122168Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6122277Z self.run_subtests( 2022-11-23T03:12:18.6122618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6122778Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6123132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6123266Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6123631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6123739Z output = model(*input) 2022-11-23T03:12:18.6124058Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6124194Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6124561Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6124728Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6125155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6125264Z _lazy_init(state, module) 2022-11-23T03:12:18.6125607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6125763Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6126078Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6126201Z return func(*args, **kwargs) 2022-11-23T03:12:18.6126577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6126678Z p_assert( 2022-11-23T03:12:18.6127009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6127166Z traceback.print_stack() 2022-11-23T03:12:18.6127410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.6127649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.6127879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.6128108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.6128504Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.6128625Z File "", line 1, in 2022-11-23T03:12:18.6128827Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6128951Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6129149Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6129298Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6129512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6129606Z self.run() 2022-11-23T03:12:18.6129804Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6129947Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6130280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6130538Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6130896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6131018Z getattr(self, test_name)() 2022-11-23T03:12:18.6131370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6131464Z fn() 2022-11-23T03:12:18.6131831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6131943Z test(self, **param_kwargs) 2022-11-23T03:12:18.6132289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6132395Z return func(*args, **kwargs) 2022-11-23T03:12:18.6132663Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6132770Z self.run_subtests( 2022-11-23T03:12:18.6133126Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6133284Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6133638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6133790Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6134202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6134310Z output = model(*input) 2022-11-23T03:12:18.6134695Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6134777Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6135144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6135314Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6135668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6135788Z _lazy_init(state, module) 2022-11-23T03:12:18.6136134Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6136310Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6136642Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6136768Z return func(*args, **kwargs) 2022-11-23T03:12:18.6137141Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6137241Z p_assert( 2022-11-23T03:12:18.6137574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6137699Z traceback.print_stack() 2022-11-23T03:12:18.6138192Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.6138470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.6138866Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.6138999Z File "", line 1, in 2022-11-23T03:12:18.6139206Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6139345Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6139549Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6139691Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6139904Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6140073Z self.run() 2022-11-23T03:12:18.6140185Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6140320Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6140654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6140795Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6141157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6141280Z getattr(self, test_name)() 2022-11-23T03:12:18.6141636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6141717Z fn() 2022-11-23T03:12:18.6142078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6142197Z test(self, **param_kwargs) 2022-11-23T03:12:18.6142604Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6142725Z return func(*args, **kwargs) 2022-11-23T03:12:18.6142997Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6143161Z self.run_subtests( 2022-11-23T03:12:18.6143526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6143670Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6143797Z File "", line 1, in 2022-11-23T03:12:18.6144501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6144644Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6144842Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6144992Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6145384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6145560Z output = model(*input) 2022-11-23T03:12:18.6145773Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6145906Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6146246Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6146363Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6146563Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6146657Z self.run() 2022-11-23T03:12:18.6147074Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6147190Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6147431Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6147580Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6147956Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6148071Z _lazy_init(state, module) 2022-11-23T03:12:18.6148397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6148524Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6148864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6148988Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6149338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6149477Z getattr(self, test_name)() 2022-11-23T03:12:18.6149783Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6149908Z return func(*args, **kwargs) 2022-11-23T03:12:18.6150271Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6150361Z fn() 2022-11-23T03:12:18.6150736Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6150820Z p_assert( 2022-11-23T03:12:18.6151179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6151294Z test(self, **param_kwargs) 2022-11-23T03:12:18.6151619Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6151747Z traceback.print_stack() 2022-11-23T03:12:18.6152095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6152209Z return func(*args, **kwargs) 2022-11-23T03:12:18.6152542Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6152652Z self.run_subtests( 2022-11-23T03:12:18.6153006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6153160Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6153517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6153737Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6154030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6154144Z output = model(*input) 2022-11-23T03:12:18.6154457Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6154579Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6155002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6155173Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6155533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6155646Z _lazy_init(state, module) 2022-11-23T03:12:18.6155993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.6156131Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6156460Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6156565Z return func(*args, **kwargs) 2022-11-23T03:12:18.6156934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6157036Z p_assert( 2022-11-23T03:12:18.6157367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6157484Z traceback.print_stack() 2022-11-23T03:12:18.6157607Z File "", line 1, in 2022-11-23T03:12:18.6157812Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6157950Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6158132Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6158358Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6158488Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6158584Z self.run() 2022-11-23T03:12:18.6158780Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6158922Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6159264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6159379Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6159736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6159850Z getattr(self, test_name)() 2022-11-23T03:12:18.6160205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6160298Z fn() 2022-11-23T03:12:18.6160660Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6160783Z test(self, **param_kwargs) 2022-11-23T03:12:18.6161137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6161242Z return func(*args, **kwargs) 2022-11-23T03:12:18.6161565Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6161682Z self.run_subtests( 2022-11-23T03:12:18.6162031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6162193Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6162549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6162698Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6163066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6163193Z output = model(*input) 2022-11-23T03:12:18.6163485Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6163667Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6164042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6164209Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6164568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6164685Z _lazy_init(state, module) 2022-11-23T03:12:18.6165029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.6165153Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6165482Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6165600Z return func(*args, **kwargs) 2022-11-23T03:12:18.6165974Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6166071Z p_assert( 2022-11-23T03:12:18.6166399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6166523Z traceback.print_stack() 2022-11-23T03:12:18.6166770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.6166988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.6167214Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.6167437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.6167827Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 
2022-11-23T03:12:18.6167947Z File "", line 1, in 2022-11-23T03:12:18.6168160Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6168294Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6168492Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6168624Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6168837Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6168934Z self.run() 2022-11-23T03:12:18.6169131Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6169270Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6169603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6169729Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6170084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6170193Z getattr(self, test_name)() 2022-11-23T03:12:18.6170667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6170770Z fn() 2022-11-23T03:12:18.6171173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6171259Z test(self, **param_kwargs) 2022-11-23T03:12:18.6171600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6171805Z return func(*args, **kwargs) 2022-11-23T03:12:18.6171988Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6172082Z self.run_subtests( 2022-11-23T03:12:18.6172426Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6172639Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6172999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6173146Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6173514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6173719Z output = model(*input) 2022-11-23T03:12:18.6173949Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6174071Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6174442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6174610Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6174979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6175096Z _lazy_init(state, module) 2022-11-23T03:12:18.6175440Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6175577Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6175911Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6176018Z return func(*args, **kwargs) 2022-11-23T03:12:18.6176391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6176490Z p_assert( 2022-11-23T03:12:18.6176822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6176943Z traceback.print_stack() 2022-11-23T03:12:18.6177341Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.6177736Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.6178127Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.6178258Z File "", line 1, in 2022-11-23T03:12:18.6178454Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6178690Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6178787Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6178929Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6179135Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6179235Z self.run() 2022-11-23T03:12:18.6179560Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6179616Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6179956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6180085Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6180437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6180553Z getattr(self, test_name)() 2022-11-23T03:12:18.6180902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6180997Z fn() 2022-11-23T03:12:18.6181361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6181464Z test(self, **param_kwargs) 2022-11-23T03:12:18.6181945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6182164Z return func(*args, **kwargs) 2022-11-23T03:12:18.6182348Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6182455Z self.run_subtests( 2022-11-23T03:12:18.6182802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6182954Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6183310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6183443Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6183809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6184228Z output = model(*input) 2022-11-23T03:12:18.6184612Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6184785Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6185140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6185329Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6185661Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6185796Z _lazy_init(state, module) 2022-11-23T03:12:18.6186138Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6186269Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6186596Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6186723Z return func(*args, **kwargs) 2022-11-23T03:12:18.6187012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6187140Z p_assert( 2022-11-23T03:12:18.6187444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6187552Z traceback.print_stack() 2022-11-23T03:12:18.6187674Z File "", line 1, in 2022-11-23T03:12:18.6187967Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6188052Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6188216Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6188360Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6188561Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6188659Z self.run() 2022-11-23T03:12:18.6188918Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6189147Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6189408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6189537Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6189889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6190006Z getattr(self, test_name)() 2022-11-23T03:12:18.6190362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6190443Z fn() 2022-11-23T03:12:18.6190799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6190916Z test(self, **param_kwargs) 2022-11-23T03:12:18.6191336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6191454Z return func(*args, **kwargs) 2022-11-23T03:12:18.6191727Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6191840Z self.run_subtests( 2022-11-23T03:12:18.6192181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6192325Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6192678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6192828Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6193191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6193303Z output = model(*input) 2022-11-23T03:12:18.6193626Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6193839Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6194121Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6194278Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6194637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6194753Z _lazy_init(state, module) 2022-11-23T03:12:18.6195102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6195241Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6195568Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6195694Z return func(*args, **kwargs) 2022-11-23T03:12:18.6196062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6196145Z p_assert( 2022-11-23T03:12:18.6196473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6196592Z traceback.print_stack() 2022-11-23T03:12:18.6196713Z File "", line 1, in 2022-11-23T03:12:18.6196915Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6197053Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6197250Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6197391Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6197586Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6197686Z self.run() 2022-11-23T03:12:18.6197935Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6198080Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6198412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6198537Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6198887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6199007Z getattr(self, test_name)() 2022-11-23T03:12:18.6199346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6199444Z fn() 2022-11-23T03:12:18.6199803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6199975Z test(self, **param_kwargs) 2022-11-23T03:12:18.6200335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6200460Z return func(*args, **kwargs) 2022-11-23T03:12:18.6200733Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6200841Z self.run_subtests( 2022-11-23T03:12:18.6201175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6201329Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6201683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6201829Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6202191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6202316Z output = model(*input) 2022-11-23T03:12:18.6202635Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6202773Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6203129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6203297Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6203653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6203765Z _lazy_init(state, module) 2022-11-23T03:12:18.6204103Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6204237Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6204570Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6204691Z return func(*args, **kwargs) 2022-11-23T03:12:18.6205051Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6205146Z p_assert( 2022-11-23T03:12:18.6205474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6205599Z traceback.print_stack() 2022-11-23T03:12:18.6205841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.6206076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.6206297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.6206519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.6206949Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.6207357Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.6207754Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.6208139Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 
2022-11-23T03:12:18.6208368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.6208597Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.6208819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.6209097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.6209482Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.6209851Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.6210237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.6210614Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.6210844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.6211070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.6211301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.6211521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.6211910Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.6212299Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.6212685Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.6213053Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.6213289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.6213522Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.6213748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.6214132Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.6214365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.6214744Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.6215124Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.6215503Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.6215766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.6216070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.6216300Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.6216688Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 
2022-11-23T03:12:18.6216916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.6217296Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.6217681Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.6218061Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.6218868Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6219612Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6220348Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6221090Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.6221333Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.6221549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.6221778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.6221998Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.6222391Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.6222786Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.6223178Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.6223562Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.6223796Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.6224272Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.6224493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.6224988Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.6225327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.6225716Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 
2022-11-23T03:12:18.6226114Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.6226405Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.6226641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.6226869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.6227086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.6227371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.6227745Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.6228123Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.6228506Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.6228881Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.6229109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.6229334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.6229564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.6229946Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 
2022-11-23T03:12:18.6230176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.6230540Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.6230920Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.6231297Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.6231526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.6231760Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.6231985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.6232208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.6232597Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.6232978Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.6233358Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.6233724Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 
2022-11-23T03:12:18.6233950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.6234221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.6234452Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.6234837Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.6235080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.6235461Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.6235846Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.6236217Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.6237009Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6237731Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6238464Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6238708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.6238940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.6239167Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.6239385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.6239779Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.6240263Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.6240553Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.6240953Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 
2022-11-23T03:12:18.6241184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 3 2022-11-23T03:12:18.6241394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T03:12:18.6241616Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T03:12:18.6241838Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 2 2022-11-23T03:12:18.6242217Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.6242654Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.6243093Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.6243489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.6243725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 3 2022-11-23T03:12:18.6243953Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T03:12:18.6244163Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T03:12:18.6244382Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 2 2022-11-23T03:12:18.6244764Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.6245140Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 
2022-11-23T03:12:18.6245585Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.6245958Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.6246181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T03:12:18.6246401Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T03:12:18.6246621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 3 2022-11-23T03:12:18.6246996Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.6247218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 2 2022-11-23T03:12:18.6247608Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.6247990Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.6248372Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.6248601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T03:12:18.6248827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T03:12:18.6249046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 3 2022-11-23T03:12:18.6249478Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 
2022-11-23T03:12:18.6249662Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 2 2022-11-23T03:12:18.6250034Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.6250406Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.6250779Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.6251001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T03:12:18.6251222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T03:12:18.6251441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 3 2022-11-23T03:12:18.6251867Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.6252107Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 2 2022-11-23T03:12:18.6252495Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.6252875Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.6253241Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.6253979Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6254764Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6255500Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6256235Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6256479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 3 2022-11-23T03:12:18.6256714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T03:12:18.6256940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T03:12:18.6257166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 2 2022-11-23T03:12:18.6257551Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.6257935Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 
2022-11-23T03:12:18.6258317Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.6258744Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.6258924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 3 2022-11-23T03:12:18.6259144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T03:12:18.6259365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T03:12:18.6259746Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.6259981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 2 2022-11-23T03:12:18.6260362Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.6260796Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.6261185Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.6261413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T03:12:18.6261625Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T03:12:18.6261844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 3 2022-11-23T03:12:18.6262225Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 
2022-11-23T03:12:18.6262605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.6262839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 2 2022-11-23T03:12:18.6263271Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.6263660Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.6264726Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6265506Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6266222Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.6266483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T03:12:18.6266719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T03:12:18.6266843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 3 2022-11-23T03:12:18.6267232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.6267615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.6267855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 2 2022-11-23T03:12:18.6268238Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.6268717Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.6268855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T03:12:18.6269078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T03:12:18.6269298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 3 2022-11-23T03:12:18.6269667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.6269986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 2 2022-11-23T03:12:18.6270430Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 
2022-11-23T03:12:18.6270818Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.6271201Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.6271435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T03:12:18.6271675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T03:12:18.6271883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 2 2022-11-23T03:12:18.6272344Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.6272656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 3 2022-11-23T03:12:18.6272944Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.6273325Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.6273706Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.6274476Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6275212Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6275949Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6276187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 3 2022-11-23T03:12:18.6276424Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T03:12:18.6276652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T03:12:18.6277050Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.6277280Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 2 2022-11-23T03:12:18.6277669Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.6278039Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.6278423Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.6278530Z dist init r=0, world=4 2022-11-23T03:12:18.6278854Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.6279217Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6279532Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6279831Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6280127Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6280421Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6280770Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6281062Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6281343Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6281638Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6281933Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.6282322Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6282433Z dist init r=1, world=4 2022-11-23T03:12:18.6282747Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6283055Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6283355Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6283654Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6283948Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6284248Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6284531Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6284829Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6285118Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.6285413Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6285754Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6286054Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6286156Z dist init r=3, world=4 2022-11-23T03:12:18.6286476Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6286783Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6287085Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6287522Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6287714Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6288005Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6288301Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.6288662Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6288966Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6289266Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6289559Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6289859Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6289971Z dist init r=2, world=4 2022-11-23T03:12:18.6290296Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6290608Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6290914Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6291203Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6291497Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.6291798Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6292094Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6292447Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6292749Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6293043Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6293331Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6293629Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6293781Z ok (28.273s) 2022-11-23T03:12:18.6294145Z test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21731 2022-11-23T03:12:18.6294348Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21732 2022-11-23T03:12:18.6294567Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21733 2022-11-23T03:12:18.6294777Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21734 2022-11-23T03:12:18.6295152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6295324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6295697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6295885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6296256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6296414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6296790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6296976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6297337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6297504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6297881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6298071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6298433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.6298609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6298964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6299149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6299393Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.6299631Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.6299925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.6300104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.6300548Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6300948Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6301379Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6301697Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.6301923Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.6302146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.6302369Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.6302596Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.6303693Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6303803Z warnings.warn( 2022-11-23T03:12:18.6304408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:12:18.6305509Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6305632Z warnings.warn( 2022-11-23T03:12:18.6306538Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6306705Z warnings.warn( 2022-11-23T03:12:18.6307711Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6307812Z warnings.warn( 2022-11-23T03:12:18.6308037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:12:18.6308276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:12:18.6308507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:12:18.6308906Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.6309285Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:12:18.6309409Z File "", line 1, in 2022-11-23T03:12:18.6309694Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6309831Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6310036Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6310176Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6310470Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6310486Z self.run() 2022-11-23T03:12:18.6310675Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6310819Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6311145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6311280Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6311636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6311852Z getattr(self, test_name)() 2022-11-23T03:12:18.6312181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6312274Z fn() 2022-11-23T03:12:18.6312631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6312751Z test(self, **param_kwargs) 2022-11-23T03:12:18.6313085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6313210Z return func(*args, **kwargs) 2022-11-23T03:12:18.6313486Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6313602Z self.run_subtests( 2022-11-23T03:12:18.6313953Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6314117Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6314475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6314619Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6314988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6315089Z output = model(*input) 2022-11-23T03:12:18.6315415Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6315553Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6315922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6316090Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6316458Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6316575Z _lazy_init(state, module) 2022-11-23T03:12:18.6316919Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6317044Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6317373Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6317487Z return func(*args, **kwargs) 2022-11-23T03:12:18.6317855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6317949Z p_assert( 2022-11-23T03:12:18.6318274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6318391Z traceback.print_stack() 2022-11-23T03:12:18.6318505Z File "", line 1, in 2022-11-23T03:12:18.6318755Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6318973Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6319089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6319261Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6319447Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6319540Z self.run() 2022-11-23T03:12:18.6319741Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6319868Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6320202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6320333Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6320785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6320915Z getattr(self, test_name)() 2022-11-23T03:12:18.6321272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6321363Z fn() 2022-11-23T03:12:18.6321725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6321829Z test(self, **param_kwargs) 2022-11-23T03:12:18.6322179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6322299Z return func(*args, **kwargs) 2022-11-23T03:12:18.6322569Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6322677Z self.run_subtests( 
2022-11-23T03:12:18.6323029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6323186Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6323539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6323675Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6324113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6324259Z output = model(*input) 2022-11-23T03:12:18.6324582Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6324728Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6325109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6325391Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6325749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6325832Z _lazy_init(state, module) 2022-11-23T03:12:18.6326101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6326248Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6326584Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6326715Z return func(*args, **kwargs) 2022-11-23T03:12:18.6327143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6327188Z p_assert( 2022-11-23T03:12:18.6327515Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.6327627Z traceback.print_stack() 2022-11-23T03:12:18.6328118Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.6328521Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:12:18.6328679Z File "", line 1, in 2022-11-23T03:12:18.6328888Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6329026Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6329225Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6329375Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6329567Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6329667Z self.run() 2022-11-23T03:12:18.6329859Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6330055Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6330395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6330524Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6330889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6331010Z getattr(self, test_name)() 2022-11-23T03:12:18.6331350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6331442Z fn() 2022-11-23T03:12:18.6331801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6331916Z test(self, **param_kwargs) 2022-11-23T03:12:18.6332274Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6332404Z return func(*args, **kwargs) 2022-11-23T03:12:18.6332674Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6332784Z self.run_subtests( 2022-11-23T03:12:18.6333118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6333285Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6333679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6333834Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6334203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6334322Z output = model(*input) 2022-11-23T03:12:18.6334639Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6334790Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6335160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6335317Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6335683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6335795Z _lazy_init(state, module) 2022-11-23T03:12:18.6336149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6336285Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6336613Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.6336737Z return func(*args, **kwargs) 2022-11-23T03:12:18.6337164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6337258Z p_assert( 2022-11-23T03:12:18.6337589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6337706Z traceback.print_stack() 2022-11-23T03:12:18.6337836Z File "", line 1, in 2022-11-23T03:12:18.6338037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6338170Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6338371Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6338522Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6338716Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6338810Z self.run() 2022-11-23T03:12:18.6339061Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6339209Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6339545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6339673Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6340027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6340131Z getattr(self, test_name)() 2022-11-23T03:12:18.6340483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6340626Z fn() 2022-11-23T03:12:18.6340983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6341063Z test(self, **param_kwargs) 2022-11-23T03:12:18.6341407Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6341532Z return func(*args, **kwargs) 2022-11-23T03:12:18.6341840Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6341899Z self.run_subtests( 2022-11-23T03:12:18.6342247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6342407Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6342833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6342982Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6343349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6343464Z output = model(*input) 2022-11-23T03:12:18.6343788Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6344226Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6344724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6344905Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6345261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6345374Z _lazy_init(state, module) 2022-11-23T03:12:18.6345714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6345867Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6346198Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.6346294Z return func(*args, **kwargs) 2022-11-23T03:12:18.6346651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6346760Z p_assert( 2022-11-23T03:12:18.6347095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6347221Z traceback.print_stack() 2022-11-23T03:12:18.6347457Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:12:18.6347691Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:12:18.6347925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:12:18.6348164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:12:18.6348544Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 
2022-11-23T03:12:18.6348740Z File "", line 1, in 2022-11-23T03:12:18.6348949Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6349089Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6349285Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6349428Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6349633Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6349719Z self.run() 2022-11-23T03:12:18.6349915Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6350059Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6350399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6350536Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6350901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6351022Z getattr(self, test_name)() 2022-11-23T03:12:18.6351373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6351451Z fn() 2022-11-23T03:12:18.6351809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6351926Z test(self, **param_kwargs) 2022-11-23T03:12:18.6352275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6352393Z return func(*args, **kwargs) 2022-11-23T03:12:18.6352667Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6352781Z self.run_subtests( 2022-11-23T03:12:18.6353132Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6353276Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6353626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6353776Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6354143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6354263Z output = model(*input) 2022-11-23T03:12:18.6354578Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6354711Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6355080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6355286Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6355658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6355773Z _lazy_init(state, module) 2022-11-23T03:12:18.6356117Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6356257Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6356583Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6356705Z return func(*args, **kwargs) 2022-11-23T03:12:18.6357076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6357160Z p_assert( 2022-11-23T03:12:18.6357495Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6357709Z traceback.print_stack() 2022-11-23T03:12:18.6358110Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.6358498Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.6358888Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:12:18.6359011Z File "", line 1, in 2022-11-23T03:12:18.6359219Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6359356Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6359542Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6359684Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6359892Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6359999Z self.run() 2022-11-23T03:12:18.6360202Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6360336Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6360663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6360776Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6361129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6361243Z getattr(self, test_name)() 2022-11-23T03:12:18.6361599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6361685Z fn() 2022-11-23T03:12:18.6362040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6362162Z test(self, **param_kwargs) 2022-11-23T03:12:18.6362513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6362618Z return func(*args, **kwargs) 2022-11-23T03:12:18.6362893Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6363003Z self.run_subtests( 2022-11-23T03:12:18.6363352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6363510Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6363868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6364020Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6364433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6364540Z output = model(*input) 2022-11-23T03:12:18.6364864Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6365002Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6365374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6365542Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6365907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6366024Z _lazy_init(state, module) 2022-11-23T03:12:18.6366367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6366491Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6366891Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6367015Z return func(*args, **kwargs) 2022-11-23T03:12:18.6367381Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6367476Z p_assert( 2022-11-23T03:12:18.6367802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6367922Z traceback.print_stack() 2022-11-23T03:12:18.6368048Z File "", line 1, in 2022-11-23T03:12:18.6368236Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6368371Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6368570Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6368718Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6368935Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6369030Z self.run() 2022-11-23T03:12:18.6369235Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6369362Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6369698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6369824Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6370187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6370302Z getattr(self, test_name)() 2022-11-23T03:12:18.6370721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6370812Z fn() 2022-11-23T03:12:18.6371167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6371279Z test(self, **param_kwargs) 2022-11-23T03:12:18.6371625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6371742Z return func(*args, **kwargs) 2022-11-23T03:12:18.6372086Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6372119Z self.run_subtests( 2022-11-23T03:12:18.6372462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6372621Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6372974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6373170Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6373531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6373650Z output = model(*input) 2022-11-23T03:12:18.6373981Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6374121Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6374501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6374758Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6375115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6375141Z _lazy_init(state, module) 2022-11-23T03:12:18.6375471Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6375657Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6375999Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6376113Z return func(*args, **kwargs) 2022-11-23T03:12:18.6376489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6376595Z p_assert( 2022-11-23T03:12:18.6376923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6377029Z traceback.print_stack() 2022-11-23T03:12:18.6377245Z File "", line 1, in 2022-11-23T03:12:18.6377359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6377500Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6377703Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6377856Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6378066Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6378169Z self.run() 2022-11-23T03:12:18.6378354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6378498Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6378835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6378961Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6379310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6379426Z getattr(self, test_name)() 2022-11-23T03:12:18.6379779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6379874Z fn() 2022-11-23T03:12:18.6380226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6380342Z test(self, **param_kwargs) 2022-11-23T03:12:18.6380693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6380815Z return func(*args, **kwargs) 2022-11-23T03:12:18.6381085Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6381196Z self.run_subtests( 2022-11-23T03:12:18.6381541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6381699Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6382043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6382191Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6382624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6382729Z output = model(*input) 2022-11-23T03:12:18.6383055Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6383186Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6383552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6383726Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6384425Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6384567Z _lazy_init(state, module) 2022-11-23T03:12:18.6384939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6385133Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6385505Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6385604Z return func(*args, **kwargs) 2022-11-23T03:12:18.6386002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6386133Z p_assert( 2022-11-23T03:12:18.6386551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6386661Z traceback.print_stack() 2022-11-23T03:12:18.6386815Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:12:18.6387048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:12:18.6387288Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:12:18.6387581Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:12:18.6387923Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:12:18.6388052Z File "", line 1, in 2022-11-23T03:12:18.6388243Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6388387Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6388583Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6388787Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6388965Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6389027Z self.run() 2022-11-23T03:12:18.6389226Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6389370Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6389698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6389856Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6390176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6390296Z getattr(self, test_name)() 2022-11-23T03:12:18.6390648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6390748Z fn() 2022-11-23T03:12:18.6391105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6391224Z test(self, **param_kwargs) 2022-11-23T03:12:18.6391613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6391687Z return func(*args, **kwargs) 2022-11-23T03:12:18.6392020Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6392144Z self.run_subtests( 2022-11-23T03:12:18.6392491Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6392653Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6393045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6393151Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6393506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6393617Z output = model(*input) 2022-11-23T03:12:18.6393940Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6394156Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6394500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6394676Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6395035Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6395148Z _lazy_init(state, module) 2022-11-23T03:12:18.6395478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6395616Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6395944Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6396067Z return func(*args, **kwargs) 2022-11-23T03:12:18.6396444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6396548Z p_assert( 2022-11-23T03:12:18.6396876Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6396993Z traceback.print_stack() 2022-11-23T03:12:18.6397371Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.6397754Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.6398136Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:12:18.6398255Z File "", line 1, in 2022-11-23T03:12:18.6398460Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6398600Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6398796Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6399029Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6399138Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6399242Z self.run() 2022-11-23T03:12:18.6399438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6399580Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6399912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6400039Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6400386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6400504Z getattr(self, test_name)() 2022-11-23T03:12:18.6400895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6400995Z fn() 2022-11-23T03:12:18.6401354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6401474Z test(self, **param_kwargs) 2022-11-23T03:12:18.6401828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6401949Z return func(*args, **kwargs) 2022-11-23T03:12:18.6402225Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6402320Z self.run_subtests( 2022-11-23T03:12:18.6402670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6402826Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6403243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6403394Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6403758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6403870Z output = model(*input) 2022-11-23T03:12:18.6404191Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6404326Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6404681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6404848Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6405208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6405324Z _lazy_init(state, module) 2022-11-23T03:12:18.6405666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6405799Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6406122Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6406239Z return func(*args, **kwargs) 2022-11-23T03:12:18.6406598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6406695Z p_assert( 2022-11-23T03:12:18.6407021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6407137Z traceback.print_stack() 2022-11-23T03:12:18.6407264Z File "", line 1, in 2022-11-23T03:12:18.6407462Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6407601Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6407791Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6407939Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6408145Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6408244Z self.run() 2022-11-23T03:12:18.6408442Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6408582Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6408922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6409054Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6409396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6409517Z getattr(self, test_name)() 2022-11-23T03:12:18.6409921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6410018Z fn() 2022-11-23T03:12:18.6410370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6410483Z test(self, **param_kwargs) 2022-11-23T03:12:18.6410848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6410950Z return func(*args, **kwargs) 2022-11-23T03:12:18.6411208Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6411314Z self.run_subtests( 2022-11-23T03:12:18.6411667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6411820Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6412235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6412389Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6412759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6412866Z output = model(*input) 2022-11-23T03:12:18.6413173Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6413304Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6413668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6413839Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6414292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6414383Z _lazy_init(state, module) 2022-11-23T03:12:18.6414680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6414819Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6415138Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6415259Z return func(*args, **kwargs) 2022-11-23T03:12:18.6415631Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6415726Z p_assert( 2022-11-23T03:12:18.6416054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6416266Z traceback.print_stack() 2022-11-23T03:12:18.6416301Z File "", line 1, in 2022-11-23T03:12:18.6416506Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6416718Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6416915Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6416980Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6417182Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6417276Z self.run() 2022-11-23T03:12:18.6417470Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6417611Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6417929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6418055Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6418409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6418530Z getattr(self, test_name)() 2022-11-23T03:12:18.6418927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6419026Z fn() 2022-11-23T03:12:18.6419463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6419505Z test(self, **param_kwargs) 2022-11-23T03:12:18.6419839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6419951Z return func(*args, **kwargs) 2022-11-23T03:12:18.6420222Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6420329Z self.run_subtests( 2022-11-23T03:12:18.6420673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6420879Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6421244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6421392Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6421747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6421866Z output = model(*input) 2022-11-23T03:12:18.6422184Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6422324Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6422700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6422865Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6423222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6423347Z _lazy_init(state, module) 2022-11-23T03:12:18.6423679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6423818Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6424526Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6424664Z return func(*args, **kwargs) 2022-11-23T03:12:18.6425008Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6425142Z p_assert( 2022-11-23T03:12:18.6425456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6425580Z traceback.print_stack() 2022-11-23T03:12:18.6425833Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:12:18.6426040Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:12:18.6426301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:12:18.6426532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:12:18.6426940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 
2022-11-23T03:12:18.6427067Z File "", line 1, in 2022-11-23T03:12:18.6427267Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6427395Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6427589Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6427639Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6427925Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6428029Z self.run() 2022-11-23T03:12:18.6428227Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6428368Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6428701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6428830Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6429171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6429295Z getattr(self, test_name)() 2022-11-23T03:12:18.6429647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6429738Z fn() 2022-11-23T03:12:18.6430097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6430308Z test(self, **param_kwargs) 2022-11-23T03:12:18.6430662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6430780Z return func(*args, **kwargs) 2022-11-23T03:12:18.6431037Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6431145Z self.run_subtests( 2022-11-23T03:12:18.6431493Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6431718Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6432017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6432151Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6432533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6432643Z output = model(*input) 2022-11-23T03:12:18.6432948Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6433087Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6433459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6433626Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6433983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6434101Z _lazy_init(state, module) 2022-11-23T03:12:18.6434446Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6434590Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6434910Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6435126Z return func(*args, **kwargs) 2022-11-23T03:12:18.6435410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6435508Z p_assert( 2022-11-23T03:12:18.6435838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6435958Z traceback.print_stack() 2022-11-23T03:12:18.6436347Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.6436735Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.6437101Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:12:18.6437282Z File "", line 1, in 2022-11-23T03:12:18.6437499Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6437637Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6437830Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6437974Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6438184Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6438280Z self.run() 2022-11-23T03:12:18.6438465Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6438605Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6438937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6439114Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6439473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6439595Z getattr(self, test_name)() 2022-11-23T03:12:18.6439950Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6440042Z fn() 2022-11-23T03:12:18.6440387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6440552Z test(self, **param_kwargs) 2022-11-23T03:12:18.6440855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6441050Z return func(*args, **kwargs) 2022-11-23T03:12:18.6441245Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6441358Z self.run_subtests( 2022-11-23T03:12:18.6441702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6441862Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6442207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6442354Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6442481Z File "", line 1, in 2022-11-23T03:12:18.6442913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6443036Z output = model(*input) 2022-11-23T03:12:18.6443352Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6443488Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6443758Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6443889Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6444261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6444436Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6444638Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6444779Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6445142Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6445264Z _lazy_init(state, module) 2022-11-23T03:12:18.6445471Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6445557Z self.run() 2022-11-23T03:12:18.6445899Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6446091Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6446296Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6446438Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6446770Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6446886Z return func(*args, **kwargs) 2022-11-23T03:12:18.6447205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6447335Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6447707Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6447800Z p_assert( 2022-11-23T03:12:18.6448157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6448323Z getattr(self, test_name)() 2022-11-23T03:12:18.6448660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6448780Z traceback.print_stack() 2022-11-23T03:12:18.6449117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6449209Z fn() 2022-11-23T03:12:18.6449568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6449683Z test(self, **param_kwargs) 2022-11-23T03:12:18.6450031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6450204Z return func(*args, **kwargs) 2022-11-23T03:12:18.6450424Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6450539Z self.run_subtests( 2022-11-23T03:12:18.6450877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6451035Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6451396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6451542Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6451910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6452019Z output = model(*input) 2022-11-23T03:12:18.6452334Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6452468Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6452825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6453006Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6453369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6453482Z _lazy_init(state, module) 2022-11-23T03:12:18.6453830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6453962Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6454289Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6454404Z return func(*args, **kwargs) 2022-11-23T03:12:18.6454860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6454863Z p_assert( 2022-11-23T03:12:18.6455270Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6455456Z traceback.print_stack() 2022-11-23T03:12:18.6455572Z File "", line 1, in 2022-11-23T03:12:18.6455767Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6455904Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6456096Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6456229Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6456440Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6456534Z self.run() 2022-11-23T03:12:18.6456735Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6456876Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6457212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6457390Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6457735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6457854Z getattr(self, test_name)() 2022-11-23T03:12:18.6458212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6458304Z fn() 2022-11-23T03:12:18.6458659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6458770Z test(self, **param_kwargs) 2022-11-23T03:12:18.6459113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6459236Z return func(*args, **kwargs) 2022-11-23T03:12:18.6459494Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6459609Z self.run_subtests( 2022-11-23T03:12:18.6459962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6460118Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6460476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6460622Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6460985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6461096Z output = model(*input) 2022-11-23T03:12:18.6461401Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6461534Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6461904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6462088Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6462453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6462565Z _lazy_init(state, module) 2022-11-23T03:12:18.6462911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6463047Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6463460Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6463481Z return func(*args, **kwargs) 2022-11-23T03:12:18.6464320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6464438Z p_assert( 2022-11-23T03:12:18.6464878Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6465008Z traceback.print_stack() 2022-11-23T03:12:18.6465221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:12:18.6465468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:12:18.6465699Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:12:18.6465925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:12:18.6466243Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:12:18.6466366Z File "", line 1, in 2022-11-23T03:12:18.6466576Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6466776Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6466973Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6467114Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6467321Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6467407Z self.run() 2022-11-23T03:12:18.6467602Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6467739Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6468072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6468196Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6468551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6468666Z getattr(self, test_name)() 2022-11-23T03:12:18.6469009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6469100Z fn() 2022-11-23T03:12:18.6469458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6469573Z test(self, **param_kwargs) 2022-11-23T03:12:18.6469922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6470046Z return func(*args, **kwargs) 2022-11-23T03:12:18.6470316Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6470431Z self.run_subtests( 2022-11-23T03:12:18.6470812Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6470973Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6471332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6471475Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6471848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6471961Z output = model(*input) 2022-11-23T03:12:18.6472278Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6472460Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6472767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6472933Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6473290Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6473410Z _lazy_init(state, module) 2022-11-23T03:12:18.6473816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6473960Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6474290Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6474407Z return func(*args, **kwargs) 2022-11-23T03:12:18.6474766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6474866Z p_assert( 2022-11-23T03:12:18.6475196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6475404Z traceback.print_stack() 2022-11-23T03:12:18.6475703Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.6476146Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.6476521Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:12:18.6476642Z File "", line 1, in 2022-11-23T03:12:18.6476852Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6476979Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6477175Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6477316Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6477523Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6477618Z self.run() 2022-11-23T03:12:18.6477813Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6477961Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6478075Z File "", line 1, in 2022-11-23T03:12:18.6478413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6478540Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6478896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6479058Z getattr(self, test_name)() 2022-11-23T03:12:18.6479218Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6479352Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6479704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:18.6479786Z fn() 2022-11-23T03:12:18.6479983Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6480134Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6480498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6480620Z test(self, **param_kwargs) 2022-11-23T03:12:18.6480828Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6480926Z self.run() 2022-11-23T03:12:18.6481275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6481382Z return func(*args, **kwargs) 2022-11-23T03:12:18.6481578Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6481719Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6481987Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6482096Z self.run_subtests( 2022-11-23T03:12:18.6482479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6482613Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6483048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6483110Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6483462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6483580Z getattr(self, test_name)() 2022-11-23T03:12:18.6483933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6484075Z fsdp_loss = 
self._train_for_several_steps( 2022-11-23T03:12:18.6484428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6484562Z fn() 2022-11-23T03:12:18.6484923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6485037Z output = model(*input) 2022-11-23T03:12:18.6485393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6485505Z test(self, **param_kwargs) 2022-11-23T03:12:18.6485819Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6485954Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6486303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6486417Z return func(*args, **kwargs) 2022-11-23T03:12:18.6486774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6486950Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6487219Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6487327Z self.run_subtests( 2022-11-23T03:12:18.6487683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6487801Z _lazy_init(state, module) 2022-11-23T03:12:18.6488142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6488297Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6488638Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6488763Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6489126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6489348Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6489599Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6489713Z return func(*args, **kwargs) 2022-11-23T03:12:18.6490076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6490261Z output = model(*input) 2022-11-23T03:12:18.6490556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6490641Z p_assert( 2022-11-23T03:12:18.6490960Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6491092Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6491467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6491597Z traceback.print_stack() 2022-11-23T03:12:18.6491962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6492130Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6492486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6492588Z _lazy_init(state, module) 2022-11-23T03:12:18.6492931Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6493069Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.6493404Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6493520Z return func(*args, **kwargs) 2022-11-23T03:12:18.6493957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6494054Z p_assert( 2022-11-23T03:12:18.6494384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6494493Z traceback.print_stack() 2022-11-23T03:12:18.6494616Z File "", line 1, in 2022-11-23T03:12:18.6494821Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6494995Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6495152Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6495297Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6495501Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6495587Z self.run() 2022-11-23T03:12:18.6495787Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6495931Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6496264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6496391Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6496743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6496858Z getattr(self, test_name)() 2022-11-23T03:12:18.6497210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6497290Z fn() 2022-11-23T03:12:18.6497649Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6497764Z test(self, **param_kwargs) 2022-11-23T03:12:18.6498118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6498244Z return func(*args, **kwargs) 2022-11-23T03:12:18.6498511Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6498620Z self.run_subtests( 2022-11-23T03:12:18.6498963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6499107Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6499460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6499605Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6499970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6500082Z output = model(*input) 2022-11-23T03:12:18.6500500Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6500646Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6501020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6501178Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6501535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6501650Z _lazy_init(state, module) 2022-11-23T03:12:18.6502001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.6502145Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6502479Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6502657Z return func(*args, **kwargs) 2022-11-23T03:12:18.6503039Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6503123Z p_assert( 2022-11-23T03:12:18.6503451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6503578Z traceback.print_stack() 2022-11-23T03:12:18.6503823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T03:12:18.6504335Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 3 2022-11-23T03:12:18.6504636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 2 2022-11-23T03:12:18.6505071Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.6505285Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T03:12:18.6505668Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 
2022-11-23T03:12:18.6505745Z File "", line 1, in 2022-11-23T03:12:18.6505986Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6506159Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6506348Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6506477Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6506701Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6506807Z self.run() 2022-11-23T03:12:18.6507009Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6507166Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6507507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6507651Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6507981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6508056Z getattr(self, test_name)() 2022-11-23T03:12:18.6508379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6508472Z fn() 2022-11-23T03:12:18.6508817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6508936Z test(self, **param_kwargs) 2022-11-23T03:12:18.6509291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6509409Z return func(*args, **kwargs) 2022-11-23T03:12:18.6509766Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6509881Z self.run_subtests( 2022-11-23T03:12:18.6510230Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6510385Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6510730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6510871Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6511239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6511369Z output = model(*input) 2022-11-23T03:12:18.6511668Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6511869Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6512243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6512419Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6512843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6512877Z _lazy_init(state, module) 2022-11-23T03:12:18.6513219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6513352Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6513787Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6513901Z return func(*args, **kwargs) 2022-11-23T03:12:18.6514172Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6514269Z p_assert( 2022-11-23T03:12:18.6514615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6514713Z traceback.print_stack() 2022-11-23T03:12:18.6514842Z File "", line 1, in 2022-11-23T03:12:18.6515048Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6515190Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6515391Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6515536Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6515730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6515828Z self.run() 2022-11-23T03:12:18.6516024Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6516172Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6516519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6516648Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6517005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6517120Z getattr(self, test_name)() 2022-11-23T03:12:18.6517460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6517548Z fn() 2022-11-23T03:12:18.6517901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6518018Z test(self, **param_kwargs) 2022-11-23T03:12:18.6518363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6518486Z return func(*args, **kwargs) 2022-11-23T03:12:18.6518805Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6518924Z self.run_subtests( 
2022-11-23T03:12:18.6519262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6519420Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6519858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6519927Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6520293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6520403Z output = model(*input) 2022-11-23T03:12:18.6520719Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6520902Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6521263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6521434Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6521797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6521907Z _lazy_init(state, module) 2022-11-23T03:12:18.6522253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6522392Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6522723Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6522841Z return func(*args, **kwargs) 2022-11-23T03:12:18.6523200Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6523304Z p_assert( 2022-11-23T03:12:18.6523636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.6523758Z traceback.print_stack() 2022-11-23T03:12:18.6524155Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.6524548Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:7 with 4 nodes. 2022-11-23T03:12:18.6524672Z File "", line 1, in 2022-11-23T03:12:18.6524972Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6525058Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6525266Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6525349Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6525654Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6525741Z self.run() 2022-11-23T03:12:18.6525942Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6525998Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6526417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6526544Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6526809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6526936Z getattr(self, test_name)() 2022-11-23T03:12:18.6527288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6527380Z fn() 2022-11-23T03:12:18.6527739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6527914Z test(self, **param_kwargs) 2022-11-23T03:12:18.6528278Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6528386Z return func(*args, **kwargs) 2022-11-23T03:12:18.6528653Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6528766Z self.run_subtests( 2022-11-23T03:12:18.6529113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6529274Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6529637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6529784Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6530214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6530316Z output = model(*input) 2022-11-23T03:12:18.6530632Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6530764Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6531130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6531300Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6531662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6531776Z _lazy_init(state, module) 2022-11-23T03:12:18.6532171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6532381Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6532638Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.6532754Z return func(*args, **kwargs) 2022-11-23T03:12:18.6533129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6533229Z p_assert( 2022-11-23T03:12:18.6533560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6533683Z traceback.print_stack() 2022-11-23T03:12:18.6533809Z File "", line 1, in 2022-11-23T03:12:18.6534001Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6534136Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6534332Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6534477Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6534687Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6534792Z self.run() 2022-11-23T03:12:18.6535055Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6535119Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6535457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6535587Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6535937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6536053Z getattr(self, test_name)() 2022-11-23T03:12:18.6536403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6536495Z fn() 2022-11-23T03:12:18.6536908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6537020Z test(self, **param_kwargs) 2022-11-23T03:12:18.6537371Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6537491Z return func(*args, **kwargs) 2022-11-23T03:12:18.6537760Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6537872Z self.run_subtests( 2022-11-23T03:12:18.6538223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6538380Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6538740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6538922Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6539297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6539417Z output = model(*input) 2022-11-23T03:12:18.6539734Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6539891Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6540237Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6540408Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6540775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6540879Z _lazy_init(state, module) 2022-11-23T03:12:18.6541226Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6541466Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6541711Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.6541830Z return func(*args, **kwargs) 2022-11-23T03:12:18.6542206Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6542304Z p_assert( 2022-11-23T03:12:18.6542692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6542802Z traceback.print_stack() 2022-11-23T03:12:18.6543044Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T03:12:18.6543275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 3 2022-11-23T03:12:18.6543506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 2 2022-11-23T03:12:18.6543746Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T03:12:18.6544576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 
2022-11-23T03:12:18.6544698Z File "", line 1, in 2022-11-23T03:12:18.6544900Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6545055Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6545249Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6545392Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6545600Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6545693Z self.run() 2022-11-23T03:12:18.6545897Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6546026Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6546377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6546502Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6546860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6546977Z getattr(self, test_name)() 2022-11-23T03:12:18.6547325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6547414Z fn() 2022-11-23T03:12:18.6547776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6547895Z test(self, **param_kwargs) 2022-11-23T03:12:18.6548245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6548415Z return func(*args, **kwargs) 2022-11-23T03:12:18.6548689Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6548793Z self.run_subtests( 2022-11-23T03:12:18.6549147Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6549308Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6549662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6549808Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6550178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6550278Z output = model(*input) 2022-11-23T03:12:18.6550600Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6550747Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6551122Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6551293Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6551657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6551779Z _lazy_init(state, module) 2022-11-23T03:12:18.6552128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6552252Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6552576Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6552695Z return func(*args, **kwargs) 2022-11-23T03:12:18.6553072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6553169Z p_assert( 2022-11-23T03:12:18.6553499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6553614Z traceback.print_stack() 2022-11-23T03:12:18.6554007Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.6554379Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.6554756Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:8 with 4 nodes. 2022-11-23T03:12:18.6554878Z File "", line 1, in 2022-11-23T03:12:18.6555089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6555229Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6555479Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6555634Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6555841Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6555926Z self.run() 2022-11-23T03:12:18.6556127Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6556268Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6556602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6556731Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6557086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6557202Z getattr(self, test_name)() 2022-11-23T03:12:18.6557631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6557710Z fn() 2022-11-23T03:12:18.6558069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6558189Z test(self, **param_kwargs) 2022-11-23T03:12:18.6558537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6558656Z return func(*args, **kwargs) 2022-11-23T03:12:18.6558925Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6559037Z self.run_subtests( 2022-11-23T03:12:18.6559388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6559532Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6559980Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6560042Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6560417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6560538Z output = model(*input) 2022-11-23T03:12:18.6560866Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6561007Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6561136Z File "", line 1, in 2022-11-23T03:12:18.6561491Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6561665Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6562027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6562149Z _lazy_init(state, module) 2022-11-23T03:12:18.6562359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6562500Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:18.6562843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6562979Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6563163Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6563303Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6563633Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6563752Z return func(*args, **kwargs) 2022-11-23T03:12:18.6563964Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6564068Z self.run() 2022-11-23T03:12:18.6564486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6564589Z p_assert( 2022-11-23T03:12:18.6564774Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6564917Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6565246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6565367Z traceback.print_stack() 2022-11-23T03:12:18.6565702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6565834Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6566186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6566342Z getattr(self, test_name)() 2022-11-23T03:12:18.6566698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6566791Z fn() 2022-11-23T03:12:18.6567153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in 
instantiated_test 2022-11-23T03:12:18.6567267Z test(self, **param_kwargs) 2022-11-23T03:12:18.6567613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6567730Z return func(*args, **kwargs) 2022-11-23T03:12:18.6568003Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6568099Z self.run_subtests( 2022-11-23T03:12:18.6568445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6568602Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6568962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6569111Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6569479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6569590Z output = model(*input) 2022-11-23T03:12:18.6569905Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6570027Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6570394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6570611Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6570984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6571112Z _lazy_init(state, module) 2022-11-23T03:12:18.6571464Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6571610Z handle.init_flat_param_attributes() 
2022-11-23T03:12:18.6571944Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6572050Z return func(*args, **kwargs) 2022-11-23T03:12:18.6572417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6572512Z p_assert( 2022-11-23T03:12:18.6572845Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6572960Z traceback.print_stack() 2022-11-23T03:12:18.6573079Z File "", line 1, in 2022-11-23T03:12:18.6573287Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6573473Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6573664Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6573814Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6574051Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6574156Z self.run() 2022-11-23T03:12:18.6574354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6574489Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6574822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6574951Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6575292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6575538Z getattr(self, test_name)() 2022-11-23T03:12:18.6575818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6575946Z fn() 2022-11-23T03:12:18.6576269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T03:12:18.6576384Z test(self, **param_kwargs) 2022-11-23T03:12:18.6576735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6576839Z return func(*args, **kwargs) 2022-11-23T03:12:18.6577112Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6577225Z self.run_subtests( 2022-11-23T03:12:18.6577576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6577734Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6578094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6578242Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6578612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6578728Z output = model(*input) 2022-11-23T03:12:18.6579036Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6579178Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6579551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6579722Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6580086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6580215Z _lazy_init(state, module) 2022-11-23T03:12:18.6580557Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6580690Z handle.init_flat_param_attributes() 
2022-11-23T03:12:18.6581007Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6581132Z return func(*args, **kwargs) 2022-11-23T03:12:18.6581503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6581603Z p_assert( 2022-11-23T03:12:18.6581936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6582057Z traceback.print_stack() 2022-11-23T03:12:18.6582297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T03:12:18.6582596Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 3 2022-11-23T03:12:18.6582828Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 2 2022-11-23T03:12:18.6583068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T03:12:18.6583483Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.6584222Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 
2022-11-23T03:12:18.6584284Z File "", line 1, in 2022-11-23T03:12:18.6584575Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6584707Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6584996Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6585125Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6585352Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6585450Z self.run() 2022-11-23T03:12:18.6585656Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6585788Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6586041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6586165Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6586516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6586622Z getattr(self, test_name)() 2022-11-23T03:12:18.6586984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6587084Z fn() 2022-11-23T03:12:18.6587446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6587566Z test(self, **param_kwargs) 2022-11-23T03:12:18.6587913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6588029Z return func(*args, **kwargs) 2022-11-23T03:12:18.6588301Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6588505Z self.run_subtests( 2022-11-23T03:12:18.6588944Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6589107Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6589463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6589687Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6589988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6590099Z output = model(*input) 2022-11-23T03:12:18.6590418Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6590639Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6590915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6591087Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6591452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6591563Z _lazy_init(state, module) 2022-11-23T03:12:18.6591969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6592114Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6592444Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6592551Z return func(*args, **kwargs) 2022-11-23T03:12:18.6592929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6593023Z p_assert( 2022-11-23T03:12:18.6593349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6593466Z traceback.print_stack() 2022-11-23T03:12:18.6593588Z File "", line 1, in 2022-11-23T03:12:18.6593791Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6593963Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6594169Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6594315Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6594522Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6594624Z self.run() 2022-11-23T03:12:18.6594822Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6594962Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6595361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6595418Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6595772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6595891Z getattr(self, test_name)() 2022-11-23T03:12:18.6596241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6596427Z fn() 2022-11-23T03:12:18.6596710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6596828Z test(self, **param_kwargs) 2022-11-23T03:12:18.6597181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6597287Z return func(*args, **kwargs) 2022-11-23T03:12:18.6597559Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6597665Z self.run_subtests( 
2022-11-23T03:12:18.6598010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6598275Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6598642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6598796Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6599162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6599264Z output = model(*input) 2022-11-23T03:12:18.6599583Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6599721Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6600088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6600265Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6600620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6600741Z _lazy_init(state, module) 2022-11-23T03:12:18.6601131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6601262Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6601697Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6601716Z return func(*args, **kwargs) 2022-11-23T03:12:18.6602096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6602196Z p_assert( 2022-11-23T03:12:18.6602527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.6602645Z traceback.print_stack() 2022-11-23T03:12:18.6603035Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.6603465Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:9 with 4 nodes. 2022-11-23T03:12:18.6603595Z File "", line 1, in 2022-11-23T03:12:18.6603800Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6604012Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6604132Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6604280Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6604488Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6604587Z self.run() 2022-11-23T03:12:18.6604772Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6604918Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6605259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6605392Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6605753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6605876Z getattr(self, test_name)() 2022-11-23T03:12:18.6606232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6606322Z fn() 2022-11-23T03:12:18.6606667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6606782Z test(self, **param_kwargs) 2022-11-23T03:12:18.6607130Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6607252Z return func(*args, **kwargs) 2022-11-23T03:12:18.6607527Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6607639Z self.run_subtests( 2022-11-23T03:12:18.6607985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6608242Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6608492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6608640Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6609004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6609120Z output = model(*input) 2022-11-23T03:12:18.6609444Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6609582Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6609963Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6610185Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6610546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6610667Z _lazy_init(state, module) 2022-11-23T03:12:18.6610793Z File "", line 1, in 2022-11-23T03:12:18.6611140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6611274Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6611601Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6611760Z return func(*args, **kwargs) 2022-11-23T03:12:18.6611920Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6612045Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6612475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6612568Z p_assert( 2022-11-23T03:12:18.6612770Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6612978Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6613248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6613367Z traceback.print_stack() 2022-11-23T03:12:18.6613560Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6613659Z self.run() 2022-11-23T03:12:18.6613851Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6613993Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6614325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6614456Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6614811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6614928Z getattr(self, test_name)() 2022-11-23T03:12:18.6615265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6615360Z fn() 2022-11-23T03:12:18.6615719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6615837Z test(self, **param_kwargs) 
2022-11-23T03:12:18.6616188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6616302Z return func(*args, **kwargs) 2022-11-23T03:12:18.6616575Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6616692Z self.run_subtests( 2022-11-23T03:12:18.6617029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6617183Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6617540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6617687Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6618053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6618166Z output = model(*input) 2022-11-23T03:12:18.6618485Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6618619Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6619023Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6619197Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6619558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6619672Z _lazy_init(state, module) 2022-11-23T03:12:18.6620019Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6620186Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6620489Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6620608Z return func(*args, **kwargs) 2022-11-23T03:12:18.6621022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6621185Z p_assert( 2022-11-23T03:12:18.6621528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6621651Z traceback.print_stack() 2022-11-23T03:12:18.6621895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T03:12:18.6622134Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 3 2022-11-23T03:12:18.6622360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 2 2022-11-23T03:12:18.6622584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T03:12:18.6622960Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 
2022-11-23T03:12:18.6623088Z File "", line 1, in 2022-11-23T03:12:18.6623292Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6623430Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6623630Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6623775Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6624232Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6624340Z self.run() 2022-11-23T03:12:18.6624609Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6624769Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6625101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6625233Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6625594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6625702Z getattr(self, test_name)() 2022-11-23T03:12:18.6625977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6626071Z fn() 2022-11-23T03:12:18.6626418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6626553Z test(self, **param_kwargs) 2022-11-23T03:12:18.6626895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6627011Z return func(*args, **kwargs) 2022-11-23T03:12:18.6627284Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6627388Z self.run_subtests( 2022-11-23T03:12:18.6627729Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6627946Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6628300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6628452Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6628818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6628933Z output = model(*input) 2022-11-23T03:12:18.6629248Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6629382Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6629751Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6629920Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6630265Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6630449Z _lazy_init(state, module) 2022-11-23T03:12:18.6630799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6630931Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6631258Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6631376Z return func(*args, **kwargs) 2022-11-23T03:12:18.6631753Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6631847Z p_assert( 2022-11-23T03:12:18.6632162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6632281Z traceback.print_stack() 2022-11-23T03:12:18.6632756Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.6633067Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.6633458Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:10 with 4 nodes. 2022-11-23T03:12:18.6633583Z File "", line 1, in 2022-11-23T03:12:18.6633786Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6633921Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6634105Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6634248Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6634460Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6634560Z self.run() 2022-11-23T03:12:18.6634763Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6634911Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6635276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6635374Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6635713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6635832Z getattr(self, test_name)() 2022-11-23T03:12:18.6636184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6636270Z fn() 2022-11-23T03:12:18.6636631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6636746Z test(self, **param_kwargs) 2022-11-23T03:12:18.6637094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6637217Z return func(*args, **kwargs) 2022-11-23T03:12:18.6637523Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6637731Z self.run_subtests( 2022-11-23T03:12:18.6638005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6638231Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6638590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6638653Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6639021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6639131Z output = model(*input) 2022-11-23T03:12:18.6639438Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6639658Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6640030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6640288Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6640564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6640676Z _lazy_init(state, module) 2022-11-23T03:12:18.6641013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6641152Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6641469Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6641597Z return func(*args, **kwargs) 2022-11-23T03:12:18.6641979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6642071Z p_assert( 2022-11-23T03:12:18.6642400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6642522Z traceback.print_stack() 2022-11-23T03:12:18.6642702Z File "", line 1, in 2022-11-23T03:12:18.6642910Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6643035Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6643231Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6643380Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6643581Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6643675Z self.run() 2022-11-23T03:12:18.6643881Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6644025Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6644347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6644474Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6644832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6644953Z getattr(self, test_name)() 2022-11-23T03:12:18.6645306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6645402Z fn() 2022-11-23T03:12:18.6645758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6645881Z test(self, **param_kwargs) 2022-11-23T03:12:18.6646218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6646392Z return func(*args, **kwargs) 2022-11-23T03:12:18.6646706Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6646873Z self.run_subtests( 2022-11-23T03:12:18.6647134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6647293Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6647653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6647799Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6648153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6648274Z output = model(*input) 2022-11-23T03:12:18.6648676Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6648816Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6649183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6649355Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6649807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6649831Z _lazy_init(state, module) 2022-11-23T03:12:18.6650160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6650400Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6650630Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6650817Z return func(*args, **kwargs) 2022-11-23T03:12:18.6651211Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6651311Z p_assert( 2022-11-23T03:12:18.6651599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6651678Z traceback.print_stack() 2022-11-23T03:12:18.6651789Z File "", line 1, in 2022-11-23T03:12:18.6651993Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6652130Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6652329Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6652474Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6652685Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6652794Z self.run() 2022-11-23T03:12:18.6652979Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6653122Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6653457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6653584Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6653936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6654055Z getattr(self, test_name)() 2022-11-23T03:12:18.6654407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6654496Z fn() 2022-11-23T03:12:18.6654838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6654960Z test(self, **param_kwargs) 2022-11-23T03:12:18.6655365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6655580Z return func(*args, **kwargs) 2022-11-23T03:12:18.6655764Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6655871Z self.run_subtests( 2022-11-23T03:12:18.6656218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6656374Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6656714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6656862Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6657234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6657393Z output = model(*input) 2022-11-23T03:12:18.6657721Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6657855Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6658224Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6658397Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6658744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6658861Z _lazy_init(state, module) 2022-11-23T03:12:18.6659202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6659336Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6659666Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6659792Z return func(*args, **kwargs) 2022-11-23T03:12:18.6660171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6660272Z p_assert( 2022-11-23T03:12:18.6660590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6660708Z traceback.print_stack() 2022-11-23T03:12:18.6660952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T03:12:18.6661184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 3 2022-11-23T03:12:18.6661410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 2 2022-11-23T03:12:18.6661635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T03:12:18.6662035Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 
2022-11-23T03:12:18.6662164Z File "", line 1, in 2022-11-23T03:12:18.6662370Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6662498Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6662697Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6662844Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6663059Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6663154Z self.run() 2022-11-23T03:12:18.6663354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6663500Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6663821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6664208Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6664732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6664861Z getattr(self, test_name)() 2022-11-23T03:12:18.6665227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6665324Z fn() 2022-11-23T03:12:18.6665674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6665800Z test(self, **param_kwargs) 2022-11-23T03:12:18.6666099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6666152Z return func(*args, **kwargs) 2022-11-23T03:12:18.6666426Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6666593Z self.run_subtests( 2022-11-23T03:12:18.6666943Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6667101Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6667456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6667601Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6667954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6668065Z output = model(*input) 2022-11-23T03:12:18.6668387Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6668523Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6668900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6669084Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6669448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6669563Z _lazy_init(state, module) 2022-11-23T03:12:18.6669892Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6670027Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6670357Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6670473Z return func(*args, **kwargs) 2022-11-23T03:12:18.6670908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6671008Z p_assert( 2022-11-23T03:12:18.6671340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6671467Z traceback.print_stack() 2022-11-23T03:12:18.6671845Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.6672234Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.6672621Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:11 with 4 nodes. 2022-11-23T03:12:18.6672741Z File "", line 1, in 2022-11-23T03:12:18.6672948Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6673159Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6673288Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6673491Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6673679Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6673790Z self.run() 2022-11-23T03:12:18.6673991Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6674133Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6674465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6674590Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6674944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6675060Z getattr(self, test_name)() 2022-11-23T03:12:18.6675399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6675492Z fn() 2022-11-23T03:12:18.6675848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6676111Z test(self, **param_kwargs) 2022-11-23T03:12:18.6676408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6676485Z return func(*args, **kwargs) 2022-11-23T03:12:18.6676756Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6676861Z self.run_subtests( 2022-11-23T03:12:18.6677195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6677356Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6677716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6677864Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6678245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6678361Z output = model(*input) 2022-11-23T03:12:18.6678681Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6678816Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6679171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6679343Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6679709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6679821Z _lazy_init(state, module) 2022-11-23T03:12:18.6680166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6680313Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6680649Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6680767Z return func(*args, **kwargs) 2022-11-23T03:12:18.6681125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6681224Z p_assert( 2022-11-23T03:12:18.6681555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6681678Z traceback.print_stack() 2022-11-23T03:12:18.6681804Z File "", line 1, in 2022-11-23T03:12:18.6682009Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6682210Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6682406Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6682543Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6682803Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6682913Z self.run() 2022-11-23T03:12:18.6683108Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6683248Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6683585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6683715Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6684059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6684175Z getattr(self, test_name)() 2022-11-23T03:12:18.6684523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6684619Z fn() 2022-11-23T03:12:18.6685053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6685172Z test(self, **param_kwargs) 2022-11-23T03:12:18.6685522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6685641Z return func(*args, **kwargs) 2022-11-23T03:12:18.6685902Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6686014Z self.run_subtests( 2022-11-23T03:12:18.6686362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6686516Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6686868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6687023Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6687391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6687511Z output = model(*input) 2022-11-23T03:12:18.6687820Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6687953Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6688326Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6688497Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6688854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6688968Z _lazy_init(state, module) 2022-11-23T03:12:18.6689314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6689455Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6689781Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6689887Z return func(*args, **kwargs) 2022-11-23T03:12:18.6690260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6690355Z p_assert( 2022-11-23T03:12:18.6690680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6690796Z traceback.print_stack() 2022-11-23T03:12:18.6690974Z File "", line 1, in 2022-11-23T03:12:18.6691122Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6691245Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6691445Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6691641Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6691864Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6691970Z self.run() 2022-11-23T03:12:18.6692172Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6692311Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6692646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6692764Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6693122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6693244Z getattr(self, test_name)() 2022-11-23T03:12:18.6693595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6693736Z fn() 2022-11-23T03:12:18.6694101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6694215Z test(self, **param_kwargs) 2022-11-23T03:12:18.6694562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6694668Z return func(*args, **kwargs) 2022-11-23T03:12:18.6694945Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6695050Z self.run_subtests( 2022-11-23T03:12:18.6695395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6695554Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6695910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6696058Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6696433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6696534Z output = model(*input) 2022-11-23T03:12:18.6696852Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6696985Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6697355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6697529Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6697884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6697997Z _lazy_init(state, module) 2022-11-23T03:12:18.6698341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6698472Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6698798Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6698915Z return func(*args, **kwargs) 2022-11-23T03:12:18.6699287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6699468Z p_assert( 2022-11-23T03:12:18.6699717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6699835Z traceback.print_stack() 2022-11-23T03:12:18.6700078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T03:12:18.6700298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 3 2022-11-23T03:12:18.6700530Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 2 2022-11-23T03:12:18.6700799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T03:12:18.6701197Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 
2022-11-23T03:12:18.6701321Z File "", line 1, in 2022-11-23T03:12:18.6701529Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6701673Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6701868Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6702001Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6702208Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6702310Z self.run() 2022-11-23T03:12:18.6702511Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6702707Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6703038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6703159Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6703515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6703621Z getattr(self, test_name)() 2022-11-23T03:12:18.6704337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6704445Z fn() 2022-11-23T03:12:18.6704891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6705019Z test(self, **param_kwargs) 2022-11-23T03:12:18.6705370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6705508Z return func(*args, **kwargs) 2022-11-23T03:12:18.6705774Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6705868Z self.run_subtests( 2022-11-23T03:12:18.6706221Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6706302Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6706666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6706821Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6707191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6707304Z output = model(*input) 2022-11-23T03:12:18.6707631Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6707756Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6708132Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6708305Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6708666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6708788Z _lazy_init(state, module) 2022-11-23T03:12:18.6709133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6709270Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6709597Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6709706Z return func(*args, **kwargs) 2022-11-23T03:12:18.6710147Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6710262Z p_assert( 2022-11-23T03:12:18.6710599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6710719Z traceback.print_stack() 2022-11-23T03:12:18.6711106Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.6711501Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.6711892Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:12 with 4 nodes. 2022-11-23T03:12:18.6712002Z File "", line 1, in 2022-11-23T03:12:18.6712208Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6712413Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6712609Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6712750Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6712954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6713056Z self.run() 2022-11-23T03:12:18.6713250Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6713432Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6713720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6713848Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6714215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6714337Z getattr(self, test_name)() 2022-11-23T03:12:18.6714690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6714787Z fn() 2022-11-23T03:12:18.6715133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6715251Z test(self, **param_kwargs) 2022-11-23T03:12:18.6715598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6715717Z return func(*args, **kwargs) 2022-11-23T03:12:18.6715986Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6716096Z self.run_subtests( 2022-11-23T03:12:18.6716438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6716603Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6716950Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6717097Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6717466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6717584Z output = model(*input) 2022-11-23T03:12:18.6717901Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6718038Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6718413Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6718580Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6718940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6719094Z _lazy_init(state, module) 2022-11-23T03:12:18.6719446Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6719578Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6719907Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6720021Z return func(*args, **kwargs) 2022-11-23T03:12:18.6720387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6720480Z p_assert( 2022-11-23T03:12:18.6720867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6720923Z traceback.print_stack() 2022-11-23T03:12:18.6721049Z File "", line 1, in 2022-11-23T03:12:18.6721301Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6721441Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6721641Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6721786Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6721990Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6722074Z self.run() 2022-11-23T03:12:18.6722271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6722409Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6722745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6722867Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6723219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6723343Z getattr(self, test_name)() 2022-11-23T03:12:18.6723698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6723778Z fn() 2022-11-23T03:12:18.6724134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6724250Z test(self, **param_kwargs) 2022-11-23T03:12:18.6724596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6724714Z return func(*args, **kwargs) 2022-11-23T03:12:18.6724985Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6725094Z self.run_subtests( 2022-11-23T03:12:18.6725435Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6725585Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6725939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6726087Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6726457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6726573Z output = model(*input) 2022-11-23T03:12:18.6726914Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6727026Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6727392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6727548Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6727959Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6728081Z _lazy_init(state, module) 2022-11-23T03:12:18.6728430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6728574Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6728903Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6729019Z return func(*args, **kwargs) 2022-11-23T03:12:18.6729387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6729471Z p_assert( 2022-11-23T03:12:18.6729805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6729924Z traceback.print_stack() 2022-11-23T03:12:18.6730048Z File "", line 1, in 2022-11-23T03:12:18.6730305Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6730446Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6730644Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6730775Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6730985Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6731084Z self.run() 2022-11-23T03:12:18.6731279Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6731420Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6731756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6731882Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6732244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6732357Z getattr(self, test_name)() 2022-11-23T03:12:18.6732711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6732888Z fn() 2022-11-23T03:12:18.6733173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.6733290Z test(self, **param_kwargs) 2022-11-23T03:12:18.6733640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6733756Z return func(*args, **kwargs) 2022-11-23T03:12:18.6734030Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6734124Z self.run_subtests( 2022-11-23T03:12:18.6734467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6734627Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6734981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6735125Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6735488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6735603Z output = model(*input) 2022-11-23T03:12:18.6735996Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6736118Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6736488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6736659Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6737067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6737187Z _lazy_init(state, module) 2022-11-23T03:12:18.6737530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6737665Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6737996Z 
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6738101Z return func(*args, **kwargs) 2022-11-23T03:12:18.6738470Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6738566Z p_assert( 2022-11-23T03:12:18.6738897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6739017Z traceback.print_stack() 2022-11-23T03:12:18.6739313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T03:12:18.6739546Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 3 2022-11-23T03:12:18.6739774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 2 2022-11-23T03:12:18.6739983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T03:12:18.6740374Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.6740762Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 
2022-11-23T03:12:18.6740956Z File "", line 1, in 2022-11-23T03:12:18.6741097Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6741237Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6741436Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6741579Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6741774Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6741873Z self.run() 2022-11-23T03:12:18.6742066Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6742282Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6742548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6742733Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6743091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6743205Z getattr(self, test_name)() 2022-11-23T03:12:18.6743545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6743648Z fn() 2022-11-23T03:12:18.6744258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6744388Z test(self, **param_kwargs) 2022-11-23T03:12:18.6744840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6744931Z return func(*args, **kwargs) 2022-11-23T03:12:18.6745297Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6745413Z self.run_subtests( 2022-11-23T03:12:18.6745746Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6745899Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6746251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6746398Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6746773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6746897Z output = model(*input) 2022-11-23T03:12:18.6747220Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6747352Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6747706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6747879Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6748237Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6748435Z _lazy_init(state, module) 2022-11-23T03:12:18.6748789Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6748931Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6749260Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6749378Z return func(*args, **kwargs) 2022-11-23T03:12:18.6749739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6749839Z p_assert( 2022-11-23T03:12:18.6750165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.6750284Z traceback.print_stack() 2022-11-23T03:12:18.6750408Z File "", line 1, in 2022-11-23T03:12:18.6750614Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6750760Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6750964Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6751096Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6751303Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6751480Z self.run() 2022-11-23T03:12:18.6751595Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6751747Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6752072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6752205Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6752547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6752672Z getattr(self, test_name)() 2022-11-23T03:12:18.6753039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6753135Z fn() 2022-11-23T03:12:18.6753497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6753620Z test(self, **param_kwargs) 2022-11-23T03:12:18.6753965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6754083Z return func(*args, **kwargs) 2022-11-23T03:12:18.6754341Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6754447Z self.run_subtests( 
2022-11-23T03:12:18.6754788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6754941Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6755356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6755514Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6755955Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6755997Z output = model(*input) 2022-11-23T03:12:18.6756304Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6756433Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6756800Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6756969Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6757328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6757488Z _lazy_init(state, module) 2022-11-23T03:12:18.6757841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6757986Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6758318Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.6758424Z return func(*args, **kwargs) 2022-11-23T03:12:18.6758795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6758894Z p_assert( 2022-11-23T03:12:18.6759230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.6759352Z traceback.print_stack() 2022-11-23T03:12:18.6759737Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.6760139Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:13 with 4 nodes. 2022-11-23T03:12:18.6760262Z File "", line 1, in 2022-11-23T03:12:18.6760455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6760588Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6760861Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6760930Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6761139Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6761238Z self.run() 2022-11-23T03:12:18.6761437Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6761563Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6761893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6762029Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6762383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6762504Z getattr(self, test_name)() 2022-11-23T03:12:18.6762855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6762953Z fn() 2022-11-23T03:12:18.6763306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6763411Z test(self, **param_kwargs) 2022-11-23T03:12:18.6763766Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6763885Z return func(*args, **kwargs) 2022-11-23T03:12:18.6764161Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6764322Z self.run_subtests( 2022-11-23T03:12:18.6764674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6764834Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6765194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6765328Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6765697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6765810Z output = model(*input) 2022-11-23T03:12:18.6766127Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6766263Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6766785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6766854Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6767216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6767320Z _lazy_init(state, module) 2022-11-23T03:12:18.6767663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6767808Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6768141Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.6768264Z return func(*args, **kwargs) 2022-11-23T03:12:18.6768633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6768737Z p_assert( 2022-11-23T03:12:18.6769064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6769173Z traceback.print_stack() 2022-11-23T03:12:18.6769299Z File "", line 1, in 2022-11-23T03:12:18.6769504Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6769637Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6769832Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6769978Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6770187Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6770284Z self.run() 2022-11-23T03:12:18.6770469Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6770647Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6771017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6771147Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6771502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.6771617Z getattr(self, test_name)() 2022-11-23T03:12:18.6771966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.6772045Z fn() 2022-11-23T03:12:18.6772493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.6772517Z test(self, **param_kwargs) 2022-11-23T03:12:18.6772871Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.6772988Z return func(*args, **kwargs) 2022-11-23T03:12:18.6773311Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T03:12:18.6773507Z self.run_subtests( 2022-11-23T03:12:18.6773778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.6773931Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.6774279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.6774421Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.6774792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.6774905Z output = model(*input) 2022-11-23T03:12:18.6775229Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.6775365Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.6775793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.6775968Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.6776314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.6776514Z _lazy_init(state, module) 2022-11-23T03:12:18.6776866Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.6776916Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.6777245Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T03:12:18.6777365Z return func(*args, **kwargs) 2022-11-23T03:12:18.6777737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.6777835Z p_assert( 2022-11-23T03:12:18.6778158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.6778277Z traceback.print_stack() 2022-11-23T03:12:18.6778516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T03:12:18.6778749Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 3 2022-11-23T03:12:18.6778976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 2 2022-11-23T03:12:18.6779200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T03:12:18.6779682Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.6779988Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.6780370Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 2022-11-23T03:12:18.6780765Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:14 with 4 nodes. 
2022-11-23T03:12:18.6780995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T03:12:18.6781219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 3 2022-11-23T03:12:18.6781443Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 2 2022-11-23T03:12:18.6781827Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.6782208Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.6782567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T03:12:18.6783006Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.6783355Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 4 nodes. 2022-11-23T03:12:18.6783542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T03:12:18.6783766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 3 2022-11-23T03:12:18.6784255Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 2 2022-11-23T03:12:18.6784747Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.6785193Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 
2022-11-23T03:12:18.6785451Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T03:12:18.6785854Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.6786199Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 4 nodes. 2022-11-23T03:12:18.6786442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T03:12:18.6786675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 3 2022-11-23T03:12:18.6786895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 2 2022-11-23T03:12:18.6787286Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.6787434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T03:12:18.6787820Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.6788209Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 2022-11-23T03:12:18.6788588Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 4 nodes. 
2022-11-23T03:12:18.6788823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T03:12:18.6789058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 3 2022-11-23T03:12:18.6789266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 2 2022-11-23T03:12:18.6789668Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.6789916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T03:12:18.6790312Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.6790709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.6791104Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:18 with 4 nodes. 2022-11-23T03:12:18.6799192Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6800048Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6800795Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6801533Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6801843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 3 2022-11-23T03:12:18.6802078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T03:12:18.6802295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 2 2022-11-23T03:12:18.6802520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T03:12:18.6802921Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.6803314Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.6803703Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 2022-11-23T03:12:18.6804096Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:19 with 4 nodes. 
2022-11-23T03:12:18.6804327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T03:12:18.6804556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 3 2022-11-23T03:12:18.6804777Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 2 2022-11-23T03:12:18.6804985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T03:12:18.6805365Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.6805748Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.6806141Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.6806527Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:20 with 4 nodes. 2022-11-23T03:12:18.6806759Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T03:12:18.6806984Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 3 2022-11-23T03:12:18.6807212Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 2 2022-11-23T03:12:18.6807602Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.6807824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T03:12:18.6808258Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 
2022-11-23T03:12:18.6808654Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.6809036Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:21 with 4 nodes. 2022-11-23T03:12:18.6809260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 3 2022-11-23T03:12:18.6809486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T03:12:18.6809707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 2 2022-11-23T03:12:18.6810084Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.6810517Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.6810755Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T03:12:18.6811125Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.6811508Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 4 nodes. 2022-11-23T03:12:18.6811736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 3 2022-11-23T03:12:18.6811954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T03:12:18.6812179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 2 2022-11-23T03:12:18.6812560Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 
2022-11-23T03:12:18.6812954Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.6813193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T03:12:18.6813574Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.6813945Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 4 nodes. 2022-11-23T03:12:18.6814171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 3 2022-11-23T03:12:18.6814392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T03:12:18.6814614Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 2 2022-11-23T03:12:18.6815003Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.6815384Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.6815615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T03:12:18.6815998Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.6816381Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 4 nodes. 2022-11-23T03:12:18.6817173Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6817921Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6818655Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6818879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 3 2022-11-23T03:12:18.6819105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T03:12:18.6819400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 2 2022-11-23T03:12:18.6819622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T03:12:18.6820010Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.6820388Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.6820769Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 2022-11-23T03:12:18.6821244Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:25 with 4 nodes. 
2022-11-23T03:12:18.6821383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T03:12:18.6821614Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 3 2022-11-23T03:12:18.6821826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 2 2022-11-23T03:12:18.6822212Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.6822591Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.6822826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T03:12:18.6823212Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.6823596Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 4 nodes. 2022-11-23T03:12:18.6823831Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T03:12:18.6824476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 3 2022-11-23T03:12:18.6824789Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T03:12:18.6825175Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.6825413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 2 2022-11-23T03:12:18.6825798Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 
2022-11-23T03:12:18.6826082Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.6826458Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:27 with 4 nodes. 2022-11-23T03:12:18.6826770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T03:12:18.6827009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 3 2022-11-23T03:12:18.6827230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 2 2022-11-23T03:12:18.6827614Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.6827986Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.6828226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T03:12:18.6828682Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.6829072Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 4 nodes. 2022-11-23T03:12:18.6829310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 3 2022-11-23T03:12:18.6829538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T03:12:18.6829766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 2 2022-11-23T03:12:18.6830152Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 
2022-11-23T03:12:18.6830386Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T03:12:18.6830766Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.6831145Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.6831528Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 4 nodes. 2022-11-23T03:12:18.6831758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T03:12:18.6831985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 3 2022-11-23T03:12:18.6832205Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 2 2022-11-23T03:12:18.6832584Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.6832963Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.6833272Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T03:12:18.6833654Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.6833959Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 4 nodes. 2022-11-23T03:12:18.6834702Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6835440Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6836236Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6836981Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6837217Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T03:12:18.6837447Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 3 2022-11-23T03:12:18.6837726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 2 2022-11-23T03:12:18.6838114Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.6838507Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 
2022-11-23T03:12:18.6838744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T03:12:18.6839125Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.6839505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 4 nodes. 2022-11-23T03:12:18.6839721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 3 2022-11-23T03:12:18.6839950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T03:12:18.6840174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 2 2022-11-23T03:12:18.6840557Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.6841028Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.6841172Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T03:12:18.6841552Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 2022-11-23T03:12:18.6841926Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 4 nodes. 
2022-11-23T03:12:18.6842162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 3 2022-11-23T03:12:18.6842378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T03:12:18.6842641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T03:12:18.6843052Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.6843292Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 2 2022-11-23T03:12:18.6843672Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.6844054Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.6844431Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:33 with 4 nodes. 2022-11-23T03:12:18.6845220Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6845954Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6846681Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6846961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T03:12:18.6847185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 3 2022-11-23T03:12:18.6847396Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T03:12:18.6847788Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.6848023Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 2 2022-11-23T03:12:18.6848408Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.6848799Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.6849193Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:34 with 4 nodes. 2022-11-23T03:12:18.6849425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 2 2022-11-23T03:12:18.6849656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T03:12:18.6849884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 3 2022-11-23T03:12:18.6850271Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 
2022-11-23T03:12:18.6850640Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.6850876Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T03:12:18.6851265Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.6851652Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:35 with 4 nodes. 2022-11-23T03:12:18.6851980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 3 2022-11-23T03:12:18.6852111Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T03:12:18.6852334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T03:12:18.6852715Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.6852947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 2 2022-11-23T03:12:18.6853364Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.6853760Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.6854232Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:36 with 4 nodes. 2022-11-23T03:12:18.6854877Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6855607Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6856391Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.6856632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 3 2022-11-23T03:12:18.6856957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T03:12:18.6857085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 2 2022-11-23T03:12:18.6857471Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.6857702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T03:12:18.6858093Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.6858464Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 2022-11-23T03:12:18.6858850Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 4 nodes. 
2022-11-23T03:12:18.6858958Z dist init r=2, world=4 2022-11-23T03:12:18.6859280Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6859587Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6859889Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6860191Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6860489Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6860779Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6861073Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6861366Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6861699Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6862000Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.6862295Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6862595Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.6862710Z dist init r=0, world=4 2022-11-23T03:12:18.6863029Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6863391Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6863694Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6864243Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6864557Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6864950Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6865242Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6865537Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.6865821Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6866098Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6866437Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6866673Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.6866876Z dist init r=1, world=4 2022-11-23T03:12:18.6867202Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6867410Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6867711Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6868006Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6868371Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6868670Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.6868967Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6869262Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6869633Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6869858Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6870214Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6870506Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.6870613Z dist init r=3, world=4 2022-11-23T03:12:18.6870978Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6871290Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6871588Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6871881Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.6872180Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6872483Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6872776Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6873066Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6873366Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6873660Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6874084Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6874380Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.6874478Z ok (31.880s) 2022-11-23T03:12:18.6874825Z test_nested_always_wrap_model_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22560 2022-11-23T03:12:18.6875032Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22561 2022-11-23T03:12:18.6875301Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22562 2022-11-23T03:12:18.6875521Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22563 2022-11-23T03:12:18.6875908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6876083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6876458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6876646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6877051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6877176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6877608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6877792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6878155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6878325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6878696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6878884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6879238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.6879408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6879768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6879954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6880190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.6880428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.6880659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.6880895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.6881292Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6881684Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6882073Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6882441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.6882667Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.6882884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.6883100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.6883316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.6883541Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6883763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6884040Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6884273Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6885391Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6885391Z warnings.warn( 2022-11-23T03:12:18.6886402Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.6886558Z warnings.warn( 2022-11-23T03:12:18.6887579Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6887690Z warnings.warn( 2022-11-23T03:12:18.6888759Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6888876Z warnings.warn( 2022-11-23T03:12:18.6889109Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6889334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6889580Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6889785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6890009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6890214Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6890444Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6890663Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6890879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6891099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6891322Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6891614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6891840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6892054Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6892262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6892535Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6892757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6892971Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6893187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6893407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6893627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6893850Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6894055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6894270Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6894539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6894753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6894965Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6895176Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6895392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6895609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6895815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6896030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6896251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6896603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6896837Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6897033Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6897254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6897469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6897690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6897894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6898108Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6898325Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6898542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6898763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6898974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6899194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6899413Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6899614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6899837Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6900054Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6900323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6900628Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6900756Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6900971Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6901187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6901400Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6901605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6901823Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6902091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6902307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6902531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6902747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6902965Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6903181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6903386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6903599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6903816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6904420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6904660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6904879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6905085Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6905311Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6905502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6905735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6905944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6906145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6906399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6906592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6906841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6907052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6907263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6907467Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6907602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6907819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6908037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6908330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6908558Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6908804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6909000Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6909203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6909428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6909647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6909758Z dist init r=0, world=4 2022-11-23T03:12:18.6909869Z dist init r=1, world=4 2022-11-23T03:12:18.6909978Z dist init r=2, world=4 2022-11-23T03:12:18.6910146Z dist init r=3, world=4 2022-11-23T03:12:18.6910299Z ok (6.423s) 2022-11-23T03:12:18.6910584Z test_nested_always_wrap_model_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22861 2022-11-23T03:12:18.6910802Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22862 2022-11-23T03:12:18.6911020Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22863 2022-11-23T03:12:18.6911228Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22864 2022-11-23T03:12:18.6911608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6911780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6912157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6912346Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6912698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6912875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6913245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6913428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6913785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6913951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:12:18.6914316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6914495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6914846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6915017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6915392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6915575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6915809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.6916045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.6916321Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.6916513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.6916954Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6917344Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6917725Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6918103Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.6918330Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.6918553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.6918771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.6919076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.6919308Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6919539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6919750Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6919976Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6921050Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6921162Z warnings.warn( 2022-11-23T03:12:18.6922182Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.6922287Z warnings.warn( 2022-11-23T03:12:18.6923287Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6923396Z warnings.warn( 2022-11-23T03:12:18.6924391Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6924498Z warnings.warn( 2022-11-23T03:12:18.6924719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6924944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6925169Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6925385Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6925657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6925888Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6926111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6926331Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6926553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6926774Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6926999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6927221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6927482Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6927701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6927917Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6928137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6928354Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6928574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6928797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6929111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6929227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6929453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6929672Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6929891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6930104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6930323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6930540Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6930759Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6930962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6931183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6931407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6931625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6931845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6932066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6932280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6932499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6932725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6932929Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6933151Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6933423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6933712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6933873Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6934098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6934317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6934535Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6934738Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6934958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6935247Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6935466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6935685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6935905Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6936121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6936342Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6936560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6936765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6936986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6937212Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6937430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6937652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6937869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6938082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6938304Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6938507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6938732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6938954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6939178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6939393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6939616Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6939837Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6940059Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6940264Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6940484Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6940702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6940925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6941191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6941492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6941646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6941896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6942091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6942299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6942549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6942799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6943080Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6943299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6943518Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6943744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6944349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6944665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6944859Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6945104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6945323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6945535Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6945575Z dist init r=1, world=4 2022-11-23T03:12:18.6945780Z dist init r=2, world=4 2022-11-23T03:12:18.6945890Z dist init r=3, world=4 2022-11-23T03:12:18.6945970Z dist init r=0, world=4 2022-11-23T03:12:18.6946040Z ok (6.924s) 2022-11-23T03:12:18.6946345Z test_nested_always_wrap_model_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23162 2022-11-23T03:12:18.6946559Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23163 2022-11-23T03:12:18.6946768Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23164 2022-11-23T03:12:18.6947022Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23165 2022-11-23T03:12:18.6947377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6947564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6947926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6948117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6948484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6948659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6949036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6949228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6949596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6949853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-11-23T03:12:18.6950225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6950417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6950773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6950944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6951314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6951497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6951740Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.6951982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.6952296Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.6952569Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.6952921Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6953316Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6953702Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6954084Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.6954313Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.6954547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.6954772Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.6954996Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.6955213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6955444Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6955671Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6955896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6956918Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6957034Z warnings.warn( 2022-11-23T03:12:18.6958045Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.6958154Z warnings.warn( 2022-11-23T03:12:18.6959208Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6959326Z warnings.warn( 2022-11-23T03:12:18.6960333Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6960442Z warnings.warn( 2022-11-23T03:12:18.6960673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6960939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6961174Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6961407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6961649Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6961866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6962095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6962320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6962544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6962771Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6962982Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6963200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6963422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6963646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6963864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6964090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6964316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6964536Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6964750Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6964974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6965196Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6965416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6965634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6965862Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6966083Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6966303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6966522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6966778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6967008Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6967222Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6967439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6967657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6967872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6968090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6968307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6968511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6968780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6969001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6969221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6969438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6969659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6969879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6970099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6970303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6970529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6970804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6971033Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6971251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6971471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6971687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6971908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6972125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6972330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6972550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6972765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6972984Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6973204Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6973420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6973640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6973864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6974107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6974356Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6974633Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6974867Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6975091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6975316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6975544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6975768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6975974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6976197Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6976423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6976694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6976918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6977141Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6977440Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6977588Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6977814Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6978117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6978245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6978462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6978690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6978912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6979135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6979351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6979572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6979776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6979998Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6980217Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6980445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6980666Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6980890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6981112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.6981228Z dist init r=2, world=4 2022-11-23T03:12:18.6981321Z dist init r=1, world=4 2022-11-23T03:12:18.6981435Z dist init r=0, world=4 2022-11-23T03:12:18.6981543Z dist init r=3, world=4 2022-11-23T03:12:18.6981645Z ok (6.724s) 2022-11-23T03:12:18.6982006Z test_nested_always_wrap_model_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23463 2022-11-23T03:12:18.6982227Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23464 2022-11-23T03:12:18.6982499Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23465 2022-11-23T03:12:18.6982725Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23466 2022-11-23T03:12:18.6983089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6983265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6983639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6983827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6984568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6984751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6985125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6985394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6985765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6985943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:12:18.6986266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.6986465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.6986798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6987000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6987370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.6987488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.6987732Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.6987959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.6988203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.6988443Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.6988836Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6989228Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6989612Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.6990043Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.6990273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.6990501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.6990707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.6990931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.6991227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6991412Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6991630Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6991951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.6992959Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6993074Z warnings.warn( 2022-11-23T03:12:18.6994086Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.6994273Z warnings.warn( 2022-11-23T03:12:18.6995280Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.6995391Z warnings.warn( 2022-11-23T03:12:18.6996402Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.6996515Z warnings.warn( 2022-11-23T03:12:18.6996627Z File "", line 1, in 2022-11-23T03:12:18.6996840Z File "", line 1, in 2022-11-23T03:12:18.6996961Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6997103Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6997307Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6997459Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6997672Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.6997795Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.6998012Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6998117Z self.run() 2022-11-23T03:12:18.6998322Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.6998471Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.6998676Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6998830Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.6999043Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.6999128Z self.run() 2022-11-23T03:12:18.6999477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.6999615Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.6999815Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.6999969Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7000381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7000514Z getattr(self, test_name)() 2022-11-23T03:12:18.7000839Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7000974Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7001337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7001438Z fn() 2022-11-23T03:12:18.7001800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7001929Z getattr(self, test_name)() 2022-11-23T03:12:18.7002295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7002418Z test(self, **param_kwargs) 2022-11-23T03:12:18.7002808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7002909Z fn() 2022-11-23T03:12:18.7003268Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7003392Z return func(*args, **kwargs) 2022-11-23T03:12:18.7003751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7003872Z test(self, **param_kwargs) 2022-11-23T03:12:18.7004125Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7004241Z self.run_subtests( 2022-11-23T03:12:18.7004584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7004709Z return func(*args, **kwargs) 2022-11-23T03:12:18.7005060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7005230Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7005485Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7005600Z self.run_subtests( 2022-11-23T03:12:18.7005962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7006116Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7006449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7006610Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7006985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7007111Z output = model(*input) 2022-11-23T03:12:18.7007479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7007632Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7007960Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7008105Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7008463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7008584Z output = model(*input) 2022-11-23T03:12:18.7008965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7009231Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7009473Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7009619Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7010033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, 
in _root_pre_forward 2022-11-23T03:12:18.7010159Z _lazy_init(state, module) 2022-11-23T03:12:18.7010518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7010695Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7011048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7011185Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7011537Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7011661Z _lazy_init(state, module) 2022-11-23T03:12:18.7012003Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7012184Z return func(*args, **kwargs) 2022-11-23T03:12:18.7012513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7012658Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7013034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7013137Z p_assert( 2022-11-23T03:12:18.7013579Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7013600Z return func(*args, **kwargs) 2022-11-23T03:12:18.7013936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7014063Z traceback.print_stack() 2022-11-23T03:12:18.7014493Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7014531Z p_assert( 2022-11-23T03:12:18.7014862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.7014989Z traceback.print_stack() 2022-11-23T03:12:18.7015122Z File "", line 1, in 2022-11-23T03:12:18.7015330Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7015475Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7015681Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7015813Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7016024Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7016128Z self.run() 2022-11-23T03:12:18.7016332Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7016490Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7016836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7016973Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7017316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7017442Z getattr(self, test_name)() 2022-11-23T03:12:18.7017803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7017902Z fn() 2022-11-23T03:12:18.7018269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7018395Z test(self, **param_kwargs) 2022-11-23T03:12:18.7018751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7018880Z return func(*args, **kwargs) 2022-11-23T03:12:18.7019165Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7019287Z self.run_subtests( 
2022-11-23T03:12:18.7019636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7019797Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7020158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7020314Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7020681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7020796Z output = model(*input) 2022-11-23T03:12:18.7021103Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7021290Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7021667Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7021933Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7022242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7022367Z _lazy_init(state, module) 2022-11-23T03:12:18.7022715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7022856Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7023175Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7023301Z return func(*args, **kwargs) 2022-11-23T03:12:18.7023681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7023793Z p_assert( 2022-11-23T03:12:18.7024345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.7024472Z traceback.print_stack() 2022-11-23T03:12:18.7024603Z File "", line 1, in 2022-11-23T03:12:18.7024897Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7025043Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7025247Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7025404Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7025613Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7025715Z self.run() 2022-11-23T03:12:18.7025925Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7026069Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7026336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7026454Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7026815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7026945Z getattr(self, test_name)() 2022-11-23T03:12:18.7027303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7027406Z fn() 2022-11-23T03:12:18.7027773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7027899Z test(self, **param_kwargs) 2022-11-23T03:12:18.7028254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7028364Z return func(*args, **kwargs) 2022-11-23T03:12:18.7028692Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7028819Z self.run_subtests( 
2022-11-23T03:12:18.7029178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7029339Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7029702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7029855Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7030226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7030328Z output = model(*input) 2022-11-23T03:12:18.7030656Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7030867Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7031247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7031425Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7031793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7031912Z _lazy_init(state, module) 2022-11-23T03:12:18.7032260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7032385Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7032725Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7032852Z return func(*args, **kwargs) 2022-11-23T03:12:18.7033238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7033342Z p_assert( 2022-11-23T03:12:18.7033675Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.7033808Z traceback.print_stack() 2022-11-23T03:12:18.7034052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7034271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7034506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7034739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7034871Z File "", line 1, in 2022-11-23T03:12:18.7035083Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7035231Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7035440Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7035572Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7035788Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7035949Z self.run() 2022-11-23T03:12:18.7036096Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7036244Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7036619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7036818Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7037117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7037193Z getattr(self, test_name)() 2022-11-23T03:12:18.7037645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7037716Z fn() 2022-11-23T03:12:18.7038082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7038201Z test(self, **param_kwargs) 2022-11-23T03:12:18.7038555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7038679Z return func(*args, **kwargs) 2022-11-23T03:12:18.7038928Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7039022Z self.run_subtests( 2022-11-23T03:12:18.7039371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7039535Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7039958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7040115Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7040485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7040607Z output = model(*input) 2022-11-23T03:12:18.7040934Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7041056Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7041436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7041619Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7041985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7042109Z _lazy_init(state, module) 2022-11-23T03:12:18.7042465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7042611Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7043009Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7043118Z return func(*args, **kwargs) 2022-11-23T03:12:18.7043502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7043609Z p_assert( 2022-11-23T03:12:18.7043947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7044077Z traceback.print_stack() 2022-11-23T03:12:18.7044211Z File "", line 1, in 2022-11-23T03:12:18.7044420Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7044568Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7044756Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7044907Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7045124Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7045230Z self.run() 2022-11-23T03:12:18.7045429Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7045574Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7045916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7046029Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7046392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7046520Z getattr(self, test_name)() 2022-11-23T03:12:18.7046927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7047031Z fn() 2022-11-23T03:12:18.7047404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7047528Z test(self, **param_kwargs) 2022-11-23T03:12:18.7047880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7047987Z return func(*args, **kwargs) 2022-11-23T03:12:18.7048250Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7048360Z self.run_subtests( 2022-11-23T03:12:18.7048710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7048876Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7049297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7049451Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7049830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7049932Z output = model(*input) 2022-11-23T03:12:18.7050261Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7050409Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7050782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7050958Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7051322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7051449Z _lazy_init(state, module) 2022-11-23T03:12:18.7051803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7051928Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7052291Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7052418Z return func(*args, **kwargs) 2022-11-23T03:12:18.7052877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7052901Z p_assert( 2022-11-23T03:12:18.7053237Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7053368Z traceback.print_stack() 2022-11-23T03:12:18.7053500Z File "", line 1, in 2022-11-23T03:12:18.7053688Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7053839Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7054043Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7054195Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7054410Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7054517Z self.run() 2022-11-23T03:12:18.7054718Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7054865Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7055188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7055324Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7055681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7055808Z getattr(self, test_name)() 2022-11-23T03:12:18.7056260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7056369Z fn() 2022-11-23T03:12:18.7056740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7056844Z test(self, **param_kwargs) 2022-11-23T03:12:18.7057200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7057326Z return func(*args, **kwargs) 2022-11-23T03:12:18.7057587Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7057734Z self.run_subtests( 2022-11-23T03:12:18.7058055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7058219Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7058662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7058889Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7059158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7059280Z output = model(*input) 2022-11-23T03:12:18.7059607Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7059753Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7060129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7060308Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7060671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7060798Z _lazy_init(state, module) 2022-11-23T03:12:18.7061136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7061282Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7061623Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7061749Z return func(*args, **kwargs) 2022-11-23T03:12:18.7062215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7062297Z p_assert( 2022-11-23T03:12:18.7062636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7062743Z traceback.print_stack() 2022-11-23T03:12:18.7062872Z File "", line 1, in 2022-11-23T03:12:18.7063085Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7063234Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7063517Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7063597Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7063807Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7064315Z self.run() 2022-11-23T03:12:18.7064518Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7064672Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7065003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7065153Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7065515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7065644Z getattr(self, test_name)() 2022-11-23T03:12:18.7065978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7066086Z fn() 2022-11-23T03:12:18.7066436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7066559Z test(self, **param_kwargs) 2022-11-23T03:12:18.7066907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7067032Z return func(*args, **kwargs) 2022-11-23T03:12:18.7067292Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7067407Z self.run_subtests( 2022-11-23T03:12:18.7067761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7067993Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7068341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7068503Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7068880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7069004Z output = model(*input) 2022-11-23T03:12:18.7069330Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7069474Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7069949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7070070Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7070417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7070595Z _lazy_init(state, module) 2022-11-23T03:12:18.7071055Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7071174Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7071552Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7071650Z return func(*args, **kwargs) 2022-11-23T03:12:18.7071973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7072080Z p_assert( 2022-11-23T03:12:18.7072396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7072523Z traceback.print_stack() 2022-11-23T03:12:18.7072852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7073009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7073246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7073476Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7073615Z File "", line 1, in 2022-11-23T03:12:18.7073852Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7073956Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7074160Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7074310Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7074524Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7074628Z self.run() 2022-11-23T03:12:18.7074933Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7075047Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7075385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7075524Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7075888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7076014Z getattr(self, test_name)() 2022-11-23T03:12:18.7076370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7076472Z fn() 2022-11-23T03:12:18.7076834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7076961Z test(self, **param_kwargs) 2022-11-23T03:12:18.7077071Z File "", line 1, in 2022-11-23T03:12:18.7077487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7077616Z return func(*args, **kwargs) 2022-11-23T03:12:18.7077898Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7077974Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7078232Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7078423Z self.run_subtests( 2022-11-23T03:12:18.7078554Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7078687Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7079039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7079207Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7079420Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7079537Z self.run() 2022-11-23T03:12:18.7079991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7080063Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7080271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7080398Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7080783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7080902Z output = model(*input) 2022-11-23T03:12:18.7081241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7081377Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7081702Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7081856Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7082200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7082423Z getattr(self, test_name)() 2022-11-23T03:12:18.7082803Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7082971Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7083270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7083441Z fn() 2022-11-23T03:12:18.7083714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7083836Z _lazy_init(state, module) 2022-11-23T03:12:18.7084178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7084355Z test(self, **param_kwargs) 2022-11-23T03:12:18.7084720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7084859Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7085212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7085404Z return func(*args, **kwargs) 2022-11-23T03:12:18.7085836Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7085871Z return func(*args, **kwargs) 2022-11-23T03:12:18.7086111Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7086226Z self.run_subtests( 2022-11-23T03:12:18.7086606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7086765Z p_assert( 2022-11-23T03:12:18.7087117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7087283Z test_fn(*test_args, **test_kwargs, 
**subtest_kwargs) 2022-11-23T03:12:18.7087620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7087747Z traceback.print_stack() 2022-11-23T03:12:18.7088088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7088242Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7088620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7088739Z output = model(*input) 2022-11-23T03:12:18.7089068Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7089218Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7089597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7089775Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7090144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7090248Z _lazy_init(state, module) 2022-11-23T03:12:18.7090380Z File "", line 1, in 2022-11-23T03:12:18.7090733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7090878Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7091216Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7091346Z return func(*args, **kwargs) 2022-11-23T03:12:18.7091605Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7091787Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7092066Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7092272Z p_assert( 2022-11-23T03:12:18.7092379Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7092530Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7092868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7092996Z traceback.print_stack() 2022-11-23T03:12:18.7093212Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7093296Z self.run() 2022-11-23T03:12:18.7093502Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7093699Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7094045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7094180Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7094544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7094670Z getattr(self, test_name)() 2022-11-23T03:12:18.7095018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7095099Z fn() 2022-11-23T03:12:18.7095469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7095594Z test(self, **param_kwargs) 2022-11-23T03:12:18.7095951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7096138Z return func(*args, **kwargs) 2022-11-23T03:12:18.7096392Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 
2022-11-23T03:12:18.7096505Z self.run_subtests( 2022-11-23T03:12:18.7096842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7097006Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7097365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7097514Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7097887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7098007Z output = model(*input) 2022-11-23T03:12:18.7098341Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7098486Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7098840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7099014Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7099380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7099506Z _lazy_init(state, module) 2022-11-23T03:12:18.7099854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7099995Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7100331Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7100460Z return func(*args, **kwargs) 2022-11-23T03:12:18.7100821Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7100926Z p_assert( 2022-11-23T03:12:18.7101263Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7101392Z traceback.print_stack() 2022-11-23T03:12:18.7101524Z File "", line 1, in 2022-11-23T03:12:18.7101737Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7101884Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7102084Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7102217Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7102432Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7102539Z self.run() 2022-11-23T03:12:18.7102794Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7102951Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7103297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7103433Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7103793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7104118Z getattr(self, test_name)() 2022-11-23T03:12:18.7104591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7104661Z fn() 2022-11-23T03:12:18.7105122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7105243Z test(self, **param_kwargs) 2022-11-23T03:12:18.7105694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7105813Z return func(*args, **kwargs) 2022-11-23T03:12:18.7106058Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in 
test_nested_always_wrap_model 2022-11-23T03:12:18.7106169Z self.run_subtests( 2022-11-23T03:12:18.7106427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7106588Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7106949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7107104Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7107474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7107601Z output = model(*input) 2022-11-23T03:12:18.7107908Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7108050Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7108426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7108603Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7108965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7109089Z _lazy_init(state, module) 2022-11-23T03:12:18.7109437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7109630Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7109922Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7110033Z return func(*args, **kwargs) 2022-11-23T03:12:18.7110416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7110519Z p_assert( 2022-11-23T03:12:18.7110855Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7110980Z traceback.print_stack() 2022-11-23T03:12:18.7111219Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7111453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7111683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7111897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7112027Z File "", line 1, in 2022-11-23T03:12:18.7112242Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7112444Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7112657Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7112808Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7113022Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7113107Z self.run() 2022-11-23T03:12:18.7113315Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7113461Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7113805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7113939Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7114296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7114469Z getattr(self, test_name)() 2022-11-23T03:12:18.7114836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7114916Z fn() 2022-11-23T03:12:18.7115281Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7115412Z test(self, **param_kwargs) 2022-11-23T03:12:18.7115769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7115895Z return func(*args, **kwargs) 2022-11-23T03:12:18.7116151Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7116263Z self.run_subtests( 2022-11-23T03:12:18.7116613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7116761Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7117151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7117280Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7117656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7117774Z output = model(*input) 2022-11-23T03:12:18.7118098Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7118236Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7118607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7118765Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7119128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7119257Z _lazy_init(state, module) 2022-11-23T03:12:18.7119611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T03:12:18.7119753Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7120088Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7120211Z return func(*args, **kwargs) 2022-11-23T03:12:18.7120589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7120671Z p_assert( 2022-11-23T03:12:18.7121006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7121131Z traceback.print_stack() 2022-11-23T03:12:18.7121260Z File "", line 1, in 2022-11-23T03:12:18.7121477Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7121663Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7121869Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7122012Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7122207Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7122309Z self.run() 2022-11-23T03:12:18.7122510Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7122654Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7122996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7123132Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7123489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7123657Z getattr(self, test_name)() 2022-11-23T03:12:18.7124023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7124126Z fn() 2022-11-23T03:12:18.7124490Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7124612Z test(self, **param_kwargs) 2022-11-23T03:12:18.7124968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7125095Z return func(*args, **kwargs) 2022-11-23T03:12:18.7125355Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7125449Z self.run_subtests( 2022-11-23T03:12:18.7125803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7125970Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7126330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7126482Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7126852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7126975Z output = model(*input) 2022-11-23T03:12:18.7127104Z File "", line 1, in 2022-11-23T03:12:18.7127412Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7127556Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7127950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7128110Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7128327Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7128468Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7128837Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7128956Z _lazy_init(state, module) 2022-11-23T03:12:18.7129141Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7129290Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7129640Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7129786Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7130006Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7130119Z self.run() 2022-11-23T03:12:18.7130440Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7130615Z return func(*args, **kwargs) 2022-11-23T03:12:18.7130806Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7130958Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7131334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7131437Z p_assert( 2022-11-23T03:12:18.7131774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7131909Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7132238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7132345Z traceback.print_stack() 2022-11-23T03:12:18.7132707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7132880Z getattr(self, test_name)() 2022-11-23T03:12:18.7133244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:18.7133343Z fn() 2022-11-23T03:12:18.7133704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7133826Z test(self, **param_kwargs) 2022-11-23T03:12:18.7134178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7134284Z return func(*args, **kwargs) 2022-11-23T03:12:18.7134537Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7134650Z self.run_subtests( 2022-11-23T03:12:18.7135004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7135174Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7135627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7135689Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7136066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7136168Z output = model(*input) 2022-11-23T03:12:18.7136489Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7136634Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7137007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7137185Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7137556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7137678Z _lazy_init(state, module) 2022-11-23T03:12:18.7138026Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7138170Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7138487Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7138612Z return func(*args, **kwargs) 2022-11-23T03:12:18.7138987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7139090Z p_assert( 2022-11-23T03:12:18.7139425Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7139554Z traceback.print_stack() 2022-11-23T03:12:18.7139689Z File "", line 1, in 2022-11-23T03:12:18.7139927Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7140077Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7140281Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7140434Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7140649Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7140751Z self.run() 2022-11-23T03:12:18.7140955Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7141095Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7141414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7141549Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7141905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7142081Z getattr(self, test_name)() 2022-11-23T03:12:18.7142442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:18.7142542Z fn() 2022-11-23T03:12:18.7142958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7143082Z test(self, **param_kwargs) 2022-11-23T03:12:18.7143419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7143544Z return func(*args, **kwargs) 2022-11-23T03:12:18.7143805Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7144167Z self.run_subtests( 2022-11-23T03:12:18.7144557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7144743Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7145112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7145298Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7145585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7145806Z output = model(*input) 2022-11-23T03:12:18.7146136Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7146259Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7146634Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7146815Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7147121Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7147223Z _lazy_init(state, module) 2022-11-23T03:12:18.7147555Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7147700Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7148035Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7148160Z return func(*args, **kwargs) 2022-11-23T03:12:18.7148539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7148640Z p_assert( 2022-11-23T03:12:18.7148981Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7149107Z traceback.print_stack() 2022-11-23T03:12:18.7149330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7149642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7149891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7150123Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7150256Z File "", line 1, in 2022-11-23T03:12:18.7150468Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7150613Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7150796Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7150950Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7151165Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7151328Z self.run() 2022-11-23T03:12:18.7151536Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7151683Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7152032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7152167Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7152509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7152637Z getattr(self, test_name)() 2022-11-23T03:12:18.7153000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7153098Z fn() 2022-11-23T03:12:18.7153503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7153589Z test(self, **param_kwargs) 2022-11-23T03:12:18.7153956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7154079Z return func(*args, **kwargs) 2022-11-23T03:12:18.7154316Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7154530Z self.run_subtests( 2022-11-23T03:12:18.7154783Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7154946Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7155308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7155463Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7155836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7155961Z output = model(*input) 2022-11-23T03:12:18.7156270Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7156412Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7156790Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7156971Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7157337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7157457Z _lazy_init(state, module) 2022-11-23T03:12:18.7157856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7158098Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7158320Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7158452Z return func(*args, **kwargs) 2022-11-23T03:12:18.7158882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7158991Z p_assert( 2022-11-23T03:12:18.7159331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7159456Z traceback.print_stack() 2022-11-23T03:12:18.7159586Z File "", line 1, in 2022-11-23T03:12:18.7159798Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7159921Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7160126Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7160277Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7160492Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7160652Z self.run() 2022-11-23T03:12:18.7160857Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7161002Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7161321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7161453Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7161810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7161935Z getattr(self, test_name)() 2022-11-23T03:12:18.7162066Z File "", line 1, in 2022-11-23T03:12:18.7162422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7162521Z fn() 2022-11-23T03:12:18.7162939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7163000Z test(self, **param_kwargs) 2022-11-23T03:12:18.7163216Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7163360Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7163721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7163848Z return func(*args, **kwargs) 
2022-11-23T03:12:18.7164051Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7164203Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7164460Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7164554Z self.run_subtests( 2022-11-23T03:12:18.7164769Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7164875Z self.run() 2022-11-23T03:12:18.7165236Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7165402Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7165604Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7165749Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7166111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7166246Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7166581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7166714Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7167089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7167217Z output = model(*input) 2022-11-23T03:12:18.7167627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7167759Z getattr(self, test_name)() 2022-11-23T03:12:18.7168084Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7168206Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7168565Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7168663Z fn() 2022-11-23T03:12:18.7169039Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7169218Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7169585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7169760Z test(self, **param_kwargs) 2022-11-23T03:12:18.7170133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7170237Z _lazy_init(state, module) 2022-11-23T03:12:18.7170594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7170720Z return func(*args, **kwargs) 2022-11-23T03:12:18.7171134Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7171280Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7171537Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7171650Z self.run_subtests( 2022-11-23T03:12:18.7171988Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7172098Z return func(*args, **kwargs) 2022-11-23T03:12:18.7172456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7172622Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7173004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7173108Z 
p_assert( 2022-11-23T03:12:18.7173469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7173624Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7173959Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7174068Z traceback.print_stack() 2022-11-23T03:12:18.7174443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7174571Z output = model(*input) 2022-11-23T03:12:18.7174986Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7175128Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7175505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7175684Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7176130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7176158Z _lazy_init(state, module) 2022-11-23T03:12:18.7176513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7176656Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7176995Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7177187Z return func(*args, **kwargs) 2022-11-23T03:12:18.7177580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7177681Z p_assert( 2022-11-23T03:12:18.7178020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7178130Z traceback.print_stack() 2022-11-23T03:12:18.7178280Z File "", line 1, in 2022-11-23T03:12:18.7178478Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7178623Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7178829Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7178982Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7179198Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7179333Z self.run() 2022-11-23T03:12:18.7179542Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7179692Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7180030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7180166Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7180524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7180651Z getattr(self, test_name)() 2022-11-23T03:12:18.7181011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7181091Z fn() 2022-11-23T03:12:18.7181457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7181586Z test(self, **param_kwargs) 2022-11-23T03:12:18.7181948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7182075Z return func(*args, **kwargs) 2022-11-23T03:12:18.7182333Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7182446Z self.run_subtests( 2022-11-23T03:12:18.7182801Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7182945Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7183308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7183460Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7183833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7184387Z output = model(*input) 2022-11-23T03:12:18.7184814Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7184952Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7185334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7185502Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7185870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7185982Z _lazy_init(state, module) 2022-11-23T03:12:18.7186251Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7186397Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7186733Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7186932Z return func(*args, **kwargs) 2022-11-23T03:12:18.7187326Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7187410Z p_assert( 2022-11-23T03:12:18.7187744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7187870Z traceback.print_stack() 2022-11-23T03:12:18.7188110Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7188348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7188643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7188880Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7189094Z File "", line 1, in 2022-11-23T03:12:18.7189292Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7189436Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7189640Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7189796Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7190088Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7190120Z self.run() 2022-11-23T03:12:18.7190323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7190450Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7190799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7190935Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7191305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7191432Z getattr(self, test_name)() 2022-11-23T03:12:18.7191792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7191892Z fn() 2022-11-23T03:12:18.7192260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7192364Z test(self, **param_kwargs) 2022-11-23T03:12:18.7192743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7192851Z return func(*args, **kwargs) 2022-11-23T03:12:18.7192992Z File "", line 1, in 2022-11-23T03:12:18.7193248Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7193365Z self.run_subtests( 2022-11-23T03:12:18.7193726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7193890Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7194081Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7194223Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7194586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7194740Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7194944Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7195100Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7195475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7195596Z output = model(*input) 2022-11-23T03:12:18.7195846Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7195963Z self.run() 2022-11-23T03:12:18.7196292Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7196434Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7196640Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7196786Z 
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7197246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7197333Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7197649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7197785Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7198249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7198371Z _lazy_init(state, module) 2022-11-23T03:12:18.7198729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7198856Z getattr(self, test_name)() 2022-11-23T03:12:18.7199209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7199357Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7199695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7199792Z fn() 2022-11-23T03:12:18.7200126Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7200253Z return func(*args, **kwargs) 2022-11-23T03:12:18.7200386Z File "", line 1, in 2022-11-23T03:12:18.7200757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7200882Z test(self, **param_kwargs) 2022-11-23T03:12:18.7201243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7201350Z p_assert( 2022-11-23T03:12:18.7201705Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7201832Z return func(*args, **kwargs) 2022-11-23T03:12:18.7202039Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7202186Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7202526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7202653Z traceback.print_stack() 2022-11-23T03:12:18.7202898Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7203017Z self.run_subtests( 2022-11-23T03:12:18.7203221Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7203373Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7203729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7203892Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7204108Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7204214Z self.run() 2022-11-23T03:12:18.7204557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7204720Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7204923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7205116Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7205504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7205624Z output = model(*input) 2022-11-23T03:12:18.7205961Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7206097Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7206403Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7206546Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7206905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7207033Z getattr(self, test_name)() 2022-11-23T03:12:18.7207409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7207641Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7208100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7208113Z fn() 2022-11-23T03:12:18.7208449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7208571Z _lazy_init(state, module) 2022-11-23T03:12:18.7208936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7209063Z test(self, **param_kwargs) 2022-11-23T03:12:18.7209410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7209558Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7209996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7210052Z return func(*args, **kwargs) 2022-11-23T03:12:18.7210371Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7210497Z return func(*args, **kwargs) 
2022-11-23T03:12:18.7210752Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7210940Z self.run_subtests( 2022-11-23T03:12:18.7211247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7211353Z p_assert( 2022-11-23T03:12:18.7211706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7211870Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7212196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7212324Z traceback.print_stack() 2022-11-23T03:12:18.7212690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7212846Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7213220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7213340Z output = model(*input) 2022-11-23T03:12:18.7213664Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7213808Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7214167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7214346Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7214758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7214888Z _lazy_init(state, module) 2022-11-23T03:12:18.7215243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T03:12:18.7215388Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7215726Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7215851Z return func(*args, **kwargs) 2022-11-23T03:12:18.7216210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7216315Z p_assert( 2022-11-23T03:12:18.7216650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7216825Z traceback.print_stack() 2022-11-23T03:12:18.7216961Z File "", line 1, in 2022-11-23T03:12:18.7217175Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7217319Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7217517Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7217656Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7217867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7217968Z self.run() 2022-11-23T03:12:18.7218168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7218312Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7218650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7218780Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7219128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7219249Z getattr(self, test_name)() 2022-11-23T03:12:18.7219606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7219706Z fn() 2022-11-23T03:12:18.7220068Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7220187Z test(self, **param_kwargs) 2022-11-23T03:12:18.7220547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7220664Z return func(*args, **kwargs) 2022-11-23T03:12:18.7220952Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7221074Z self.run_subtests( 2022-11-23T03:12:18.7221439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7221602Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7221962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7222119Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7222524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7222620Z output = model(*input) 2022-11-23T03:12:18.7222929Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7223073Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7223450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7223628Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7224318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7224558Z _lazy_init(state, module) 2022-11-23T03:12:18.7224913Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T03:12:18.7225055Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7225372Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7225415Z return func(*args, **kwargs) 2022-11-23T03:12:18.7225882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7225985Z p_assert( 2022-11-23T03:12:18.7226238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7226434Z traceback.print_stack() 2022-11-23T03:12:18.7226677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7226914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7227130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7227362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7227499Z File "", line 1, in 2022-11-23T03:12:18.7227711Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7227960Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7228059Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7228212Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7228427Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7228522Z self.run() 2022-11-23T03:12:18.7228728Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7228879Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7229228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7229363Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7229725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7229851Z getattr(self, test_name)() 2022-11-23T03:12:18.7230188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7230376Z fn() 2022-11-23T03:12:18.7230658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7230787Z test(self, **param_kwargs) 2022-11-23T03:12:18.7231144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7231271Z return func(*args, **kwargs) 2022-11-23T03:12:18.7231531Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7231647Z self.run_subtests( 2022-11-23T03:12:18.7231980Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7232141Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7232507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7232661Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7233039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7233213Z output = model(*input) 2022-11-23T03:12:18.7233551Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7233693Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7234051Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7234231Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7234597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7234716Z _lazy_init(state, module) 2022-11-23T03:12:18.7235069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7235286Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7235609Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7235734Z return func(*args, **kwargs) 2022-11-23T03:12:18.7236115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7236200Z p_assert( 2022-11-23T03:12:18.7236541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7236669Z traceback.print_stack() 2022-11-23T03:12:18.7236801Z File "", line 1, in 2022-11-23T03:12:18.7237012Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7237156Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7237359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7237490Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7237710Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7237819Z self.run() 2022-11-23T03:12:18.7238021Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7238167Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7238508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7238642Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7239004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7239110Z getattr(self, test_name)() 2022-11-23T03:12:18.7239469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7239570Z fn() 2022-11-23T03:12:18.7239936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7240066Z test(self, **param_kwargs) 2022-11-23T03:12:18.7240425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7240551Z return func(*args, **kwargs) 2022-11-23T03:12:18.7240788Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7240904Z self.run_subtests( 2022-11-23T03:12:18.7241261Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7241425Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7241789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7241945Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7242396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7242631Z output = model(*input) 2022-11-23T03:12:18.7242970Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7243093Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7243468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7243645Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7244070Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7244139Z _lazy_init(state, module) 2022-11-23T03:12:18.7244489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7244632Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7245028Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7245135Z return func(*args, **kwargs) 2022-11-23T03:12:18.7245520Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7245626Z p_assert( 2022-11-23T03:12:18.7245967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7246101Z traceback.print_stack() 2022-11-23T03:12:18.7246236Z File "", line 1, in 2022-11-23T03:12:18.7246446Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7246570Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7246779Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7246933Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7247155Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7247263Z self.run() 2022-11-23T03:12:18.7247470Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7247663Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7247960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7248077Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7248443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7248564Z getattr(self, test_name)() 2022-11-23T03:12:18.7248924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7249029Z fn() 2022-11-23T03:12:18.7249392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7249526Z test(self, **param_kwargs) 2022-11-23T03:12:18.7249880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7249987Z return func(*args, **kwargs) 2022-11-23T03:12:18.7250243Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7250361Z self.run_subtests( 2022-11-23T03:12:18.7250714Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7250878Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7251240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7251396Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7251815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7251929Z output = model(*input) 2022-11-23T03:12:18.7252259Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7252402Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7252774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7252953Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7253317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7253437Z _lazy_init(state, module) 2022-11-23T03:12:18.7253787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7253981Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7254324Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7254450Z return func(*args, **kwargs) 2022-11-23T03:12:18.7254826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7254932Z p_assert( 2022-11-23T03:12:18.7255268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7255394Z traceback.print_stack() 2022-11-23T03:12:18.7255524Z File "", line 1, in 2022-11-23T03:12:18.7255716Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7255859Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7256061Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7256214Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7256438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7256545Z self.run() 2022-11-23T03:12:18.7256749Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7256874Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7257213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7257350Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7257711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7257835Z getattr(self, test_name)() 2022-11-23T03:12:18.7258190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7258288Z fn() 2022-11-23T03:12:18.7258651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7258762Z test(self, **param_kwargs) 2022-11-23T03:12:18.7259122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7259249Z return func(*args, **kwargs) 2022-11-23T03:12:18.7259507Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7259620Z self.run_subtests( 2022-11-23T03:12:18.7259974Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7260140Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7260504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7260638Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7261063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7261191Z output = model(*input) 2022-11-23T03:12:18.7261518Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7261660Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7262037Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7262213Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7262579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7262681Z _lazy_init(state, module) 2022-11-23T03:12:18.7263030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7263320Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7263568Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7263695Z return func(*args, **kwargs) 2022-11-23T03:12:18.7264452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7264599Z p_assert( 2022-11-23T03:12:18.7264950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7265046Z traceback.print_stack() 2022-11-23T03:12:18.7265271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7265521Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7265740Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7265963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7266024Z File "", line 1, in 2022-11-23T03:12:18.7266232Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7266372Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7266557Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7266704Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7266914Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7267015Z self.run() 2022-11-23T03:12:18.7267213Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7267356Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7267696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7267814Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7268179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7268300Z getattr(self, test_name)() 2022-11-23T03:12:18.7268657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7268750Z fn() 2022-11-23T03:12:18.7269111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7269233Z test(self, **param_kwargs) 2022-11-23T03:12:18.7269584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7269691Z return func(*args, **kwargs) 2022-11-23T03:12:18.7269948Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7270067Z self.run_subtests( 2022-11-23T03:12:18.7270486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7270660Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7271088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7271247Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7271623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7271723Z output = model(*input) 2022-11-23T03:12:18.7272049Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7272190Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7272566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7272809Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7273173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7273292Z _lazy_init(state, module) 2022-11-23T03:12:18.7273637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7273775Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7274093Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7274215Z return func(*args, **kwargs) 2022-11-23T03:12:18.7274599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7274686Z p_assert( 2022-11-23T03:12:18.7275019Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7275145Z traceback.print_stack() 2022-11-23T03:12:18.7275276Z File "", line 1, in 2022-11-23T03:12:18.7275467Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7275606Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7275894Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7275953Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7276163Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7276263Z self.run() 2022-11-23T03:12:18.7276461Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7276604Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7276714Z File "", line 1, in 2022-11-23T03:12:18.7277052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7277192Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7277549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7277670Z getattr(self, test_name)() 2022-11-23T03:12:18.7277875Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7278013Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7278381Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7278449Z fn() 2022-11-23T03:12:18.7278684Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7278795Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7279260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7279386Z test(self, **param_kwargs) 2022-11-23T03:12:18.7279644Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7279750Z self.run() 2022-11-23T03:12:18.7280093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7280215Z return func(*args, **kwargs) 2022-11-23T03:12:18.7280414Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7280557Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7280813Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7280928Z self.run_subtests( 2022-11-23T03:12:18.7281264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7281396Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7281782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7281944Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7282303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7282424Z getattr(self, test_name)() 2022-11-23T03:12:18.7282783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, 
in _test_fsdp_parity 2022-11-23T03:12:18.7282934Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7283285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7283383Z fn() 2022-11-23T03:12:18.7283735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7283856Z output = model(*input) 2022-11-23T03:12:18.7284220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7284342Z test(self, **param_kwargs) 2022-11-23T03:12:18.7284664Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7284902Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7285163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7285285Z return func(*args, **kwargs) 2022-11-23T03:12:18.7285643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7285810Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7286063Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7286247Z self.run_subtests( 2022-11-23T03:12:18.7286550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7286670Z _lazy_init(state, module) 2022-11-23T03:12:18.7287018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7287177Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7287510Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7287649Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7288009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7288160Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7288493Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7288664Z return func(*args, **kwargs) 2022-11-23T03:12:18.7289047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7289164Z output = model(*input) 2022-11-23T03:12:18.7289522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7289624Z p_assert( 2022-11-23T03:12:18.7289948Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7290086Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7290415Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7290538Z traceback.print_stack() 2022-11-23T03:12:18.7290908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7291134Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7291482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7291600Z _lazy_init(state, module) 2022-11-23T03:12:18.7291950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7292090Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.7292432Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7292545Z return func(*args, **kwargs) 2022-11-23T03:12:18.7292916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7293073Z p_assert( 2022-11-23T03:12:18.7293336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7293471Z traceback.print_stack() 2022-11-23T03:12:18.7293598Z File "", line 1, in 2022-11-23T03:12:18.7293807Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7293947Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7294144Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7294293Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7294487Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7294589Z self.run() 2022-11-23T03:12:18.7294788Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7294932Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7295268Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7295402Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7295769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7295885Z getattr(self, test_name)() 2022-11-23T03:12:18.7296224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7296320Z fn() 2022-11-23T03:12:18.7296684Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7296806Z test(self, **param_kwargs) 2022-11-23T03:12:18.7297157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7297280Z return func(*args, **kwargs) 2022-11-23T03:12:18.7297531Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7297648Z self.run_subtests( 2022-11-23T03:12:18.7298114Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7298199Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7298566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7298721Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7299097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7299218Z output = model(*input) 2022-11-23T03:12:18.7299544Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7299687Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7300044Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7300274Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7300646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7300772Z _lazy_init(state, module) 2022-11-23T03:12:18.7301123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T03:12:18.7301270Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7301606Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7301733Z return func(*args, **kwargs) 2022-11-23T03:12:18.7302092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7302196Z p_assert( 2022-11-23T03:12:18.7302538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7302668Z traceback.print_stack() 2022-11-23T03:12:18.7302911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7303149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7303378Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7303612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7303724Z File "", line 1, in 2022-11-23T03:12:18.7304136Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7304293Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7304593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7304746Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7304978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7305091Z self.run() 2022-11-23T03:12:18.7305266Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7305392Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7305766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7305907Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7306263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7306377Z getattr(self, test_name)() 2022-11-23T03:12:18.7306740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7306861Z fn() 2022-11-23T03:12:18.7307200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7307411Z test(self, **param_kwargs) 2022-11-23T03:12:18.7307690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7307815Z return func(*args, **kwargs) 2022-11-23T03:12:18.7308153Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7308189Z self.run_subtests( 2022-11-23T03:12:18.7308540Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7308704Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7309045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7309201Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7309651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7309770Z output = model(*input) 2022-11-23T03:12:18.7310097Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7310337Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7310616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7310793Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7311157Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7311259Z _lazy_init(state, module) 2022-11-23T03:12:18.7311606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7311751Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7312093Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7312219Z return func(*args, **kwargs) 2022-11-23T03:12:18.7312598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7312704Z p_assert( 2022-11-23T03:12:18.7313040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7313147Z traceback.print_stack() 2022-11-23T03:12:18.7313277Z File "", line 1, in 2022-11-23T03:12:18.7313485Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7313628Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7313834Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7313991Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7314207Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7314292Z self.run() 2022-11-23T03:12:18.7314497Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7314681Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7314988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7315122Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7315536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7315616Z getattr(self, test_name)() 2022-11-23T03:12:18.7315973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7316054Z fn() 2022-11-23T03:12:18.7316467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7316598Z test(self, **param_kwargs) 2022-11-23T03:12:18.7316958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7317085Z return func(*args, **kwargs) 2022-11-23T03:12:18.7317341Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7317460Z self.run_subtests( 2022-11-23T03:12:18.7317856Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7317957Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7318317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7318466Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7318911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7319030Z output = model(*input) 2022-11-23T03:12:18.7319355Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7319496Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7319870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7320028Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7320394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7320518Z _lazy_init(state, module) 2022-11-23T03:12:18.7320868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7321016Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7321353Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7321480Z return func(*args, **kwargs) 2022-11-23T03:12:18.7321857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7321939Z p_assert( 2022-11-23T03:12:18.7322273Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7322401Z traceback.print_stack() 2022-11-23T03:12:18.7322530Z File "", line 1, in 2022-11-23T03:12:18.7322744Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7322888Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7323095Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7323246Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7323451Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7323556Z self.run() 2022-11-23T03:12:18.7323763Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7323910Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7324254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7324389Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7324752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7324856Z getattr(self, test_name)() 2022-11-23T03:12:18.7325218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7325319Z fn() 2022-11-23T03:12:18.7325737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7325868Z test(self, **param_kwargs) 2022-11-23T03:12:18.7326230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7326355Z return func(*args, **kwargs) 2022-11-23T03:12:18.7326615Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7326710Z self.run_subtests( 2022-11-23T03:12:18.7327065Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7327229Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7327591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7327795Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7328173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7328338Z output = model(*input) 2022-11-23T03:12:18.7328622Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7328745Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7329120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7329296Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7329666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7329790Z _lazy_init(state, module) 2022-11-23T03:12:18.7330139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7330290Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7330699Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7330737Z return func(*args, **kwargs) 2022-11-23T03:12:18.7331118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7331223Z p_assert( 2022-11-23T03:12:18.7331567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7331697Z traceback.print_stack() 2022-11-23T03:12:18.7331830Z File "", line 1, in 2022-11-23T03:12:18.7332037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7332179Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7332362Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7332524Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7332738Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7332843Z self.run() 2022-11-23T03:12:18.7333047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7333194Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7333536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7333654Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7334018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7334141Z getattr(self, test_name)() 2022-11-23T03:12:18.7334498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7334604Z fn() 2022-11-23T03:12:18.7335018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7335148Z test(self, **param_kwargs) 2022-11-23T03:12:18.7335507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7335687Z return func(*args, **kwargs) 2022-11-23T03:12:18.7335874Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7335991Z self.run_subtests( 2022-11-23T03:12:18.7336416Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7336509Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7336872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7337076Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7337461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7337564Z output = model(*input) 2022-11-23T03:12:18.7337889Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7338035Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7338411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7338588Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7338950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7339076Z _lazy_init(state, module) 2022-11-23T03:12:18.7339426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7339558Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7339903Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7340030Z return func(*args, **kwargs) 2022-11-23T03:12:18.7340408Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7340511Z p_assert( 2022-11-23T03:12:18.7340847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7340975Z traceback.print_stack() 2022-11-23T03:12:18.7341213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7341428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7341720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7341907Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7342041Z File "", line 1, in 2022-11-23T03:12:18.7342254Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7342398Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7342601Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7342752Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7343077Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7343111Z self.run() 2022-11-23T03:12:18.7343317Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7343465Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7343807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7344301Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7344774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7344902Z getattr(self, test_name)() 2022-11-23T03:12:18.7345256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7345366Z fn() 2022-11-23T03:12:18.7345732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7345849Z test(self, **param_kwargs) 2022-11-23T03:12:18.7346209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7346262Z return func(*args, **kwargs) 2022-11-23T03:12:18.7346502Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7346682Z self.run_subtests( 2022-11-23T03:12:18.7347024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7347188Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7347553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7347710Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7348083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7348203Z output = model(*input) 2022-11-23T03:12:18.7348532Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7348673Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7349030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7349217Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7349586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7349709Z _lazy_init(state, module) 2022-11-23T03:12:18.7350064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7350213Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7350551Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7350677Z return func(*args, **kwargs) 2022-11-23T03:12:18.7351036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7351139Z p_assert( 2022-11-23T03:12:18.7351484Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7351612Z traceback.print_stack() 2022-11-23T03:12:18.7351743Z File "", line 1, in 2022-11-23T03:12:18.7351953Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7352097Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7352281Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7352524Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7352652Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7352758Z self.run() 2022-11-23T03:12:18.7352962Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7353115Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7353457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7353642Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7353994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7354119Z getattr(self, test_name)() 2022-11-23T03:12:18.7354523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7354578Z fn() 2022-11-23T03:12:18.7354943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7355069Z test(self, **param_kwargs) 2022-11-23T03:12:18.7355424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7355547Z return func(*args, **kwargs) 2022-11-23T03:12:18.7355787Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7355974Z self.run_subtests( 2022-11-23T03:12:18.7356333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7356499Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7356863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7357017Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7357155Z File "", line 1, in 2022-11-23T03:12:18.7357530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7357631Z output = model(*input) 2022-11-23T03:12:18.7357954Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7358103Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7358317Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7358535Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7358838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7359061Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7359227Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7359362Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7359731Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7359855Z _lazy_init(state, module) 2022-11-23T03:12:18.7360069Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7360176Z self.run() 2022-11-23T03:12:18.7360532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7360677Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7360861Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7361011Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7361351Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7361481Z return func(*args, **kwargs) 2022-11-23T03:12:18.7361817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7361953Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7362328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7362432Z p_assert( 2022-11-23T03:12:18.7362821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7362955Z getattr(self, test_name)() 2022-11-23T03:12:18.7363297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7363429Z traceback.print_stack() 2022-11-23T03:12:18.7363792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7363891Z fn() 2022-11-23T03:12:18.7364252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7364376Z test(self, **param_kwargs) 2022-11-23T03:12:18.7364712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7364839Z return func(*args, **kwargs) 2022-11-23T03:12:18.7365099Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7365272Z self.run_subtests( 2022-11-23T03:12:18.7365628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7365795Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7366156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7366309Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7366665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7366785Z output = model(*input) 2022-11-23T03:12:18.7367114Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7367256Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7367639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7367816Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7368183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7368307Z _lazy_init(state, module) 2022-11-23T03:12:18.7368638Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7368785Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7369123Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7369249Z return func(*args, **kwargs) 2022-11-23T03:12:18.7369627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7369734Z p_assert( 2022-11-23T03:12:18.7370076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7370206Z traceback.print_stack() 2022-11-23T03:12:18.7370317Z File "", line 1, in 2022-11-23T03:12:18.7370527Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7370674Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7370878Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7371082Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7371299Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7371403Z self.run() 2022-11-23T03:12:18.7371612Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7371740Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7372136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7372277Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7372643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7372768Z getattr(self, test_name)() 2022-11-23T03:12:18.7373124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7373309Z fn() 2022-11-23T03:12:18.7373568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7373690Z test(self, **param_kwargs) 2022-11-23T03:12:18.7374049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7374176Z return func(*args, **kwargs) 2022-11-23T03:12:18.7374493Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7374611Z self.run_subtests( 2022-11-23T03:12:18.7374967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7375131Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7375474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7375734Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7376009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7376221Z output = model(*input) 2022-11-23T03:12:18.7376460Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7376603Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7376983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7377164Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7377530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7377633Z _lazy_init(state, module) 2022-11-23T03:12:18.7377982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7378127Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7378466Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7378591Z return func(*args, **kwargs) 2022-11-23T03:12:18.7378968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7379127Z p_assert( 2022-11-23T03:12:18.7379488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7379532Z traceback.print_stack() 2022-11-23T03:12:18.7379773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7380010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7380245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7380547Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7380614Z File "", line 1, in 2022-11-23T03:12:18.7380824Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7380947Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7381150Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7381356Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7381580Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7381689Z self.run() 2022-11-23T03:12:18.7381894Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7382041Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7382386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7382501Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7382862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7382992Z getattr(self, test_name)() 2022-11-23T03:12:18.7383354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7383520Z fn() 2022-11-23T03:12:18.7384100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7384237Z test(self, **param_kwargs) 2022-11-23T03:12:18.7384687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7384813Z return func(*args, **kwargs) 2022-11-23T03:12:18.7385075Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7385185Z self.run_subtests( 2022-11-23T03:12:18.7385532Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7385711Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7386015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7386139Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7386518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7386621Z output = model(*input) 2022-11-23T03:12:18.7386952Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7387096Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7387475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7387653Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7388020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7388143Z _lazy_init(state, module) 2022-11-23T03:12:18.7388493Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7388625Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7388965Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7389095Z return func(*args, **kwargs) 2022-11-23T03:12:18.7389476Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7389580Z p_assert( 2022-11-23T03:12:18.7389918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7390048Z traceback.print_stack() 2022-11-23T03:12:18.7390177Z File "", line 1, in 2022-11-23T03:12:18.7390367Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7390511Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7390723Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7390948Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7391175Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7391283Z self.run() 2022-11-23T03:12:18.7391486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7391614Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7392035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7392090Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7392452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7392580Z getattr(self, test_name)() 2022-11-23T03:12:18.7392935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7393099Z fn() 2022-11-23T03:12:18.7393473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7393579Z test(self, **param_kwargs) 2022-11-23T03:12:18.7393935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7394063Z return func(*args, **kwargs) 2022-11-23T03:12:18.7394322Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7394439Z self.run_subtests( 2022-11-23T03:12:18.7394795Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7394963Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7395326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7395466Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7395843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7395964Z output = model(*input) 2022-11-23T03:12:18.7396292Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7396436Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7396813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7396994Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7397360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7397462Z _lazy_init(state, module) 2022-11-23T03:12:18.7397819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7397966Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7398304Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7398529Z return func(*args, **kwargs) 2022-11-23T03:12:18.7398810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7398916Z p_assert( 2022-11-23T03:12:18.7399252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7399360Z traceback.print_stack() 2022-11-23T03:12:18.7399492Z File "", line 1, in 2022-11-23T03:12:18.7399702Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7399843Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7400054Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7400262Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7400486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7400593Z self.run() 2022-11-23T03:12:18.7400778Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7400925Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7401266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7401400Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7401765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7401890Z getattr(self, test_name)() 2022-11-23T03:12:18.7402249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7402377Z fn() 2022-11-23T03:12:18.7402749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7402876Z test(self, **param_kwargs) 2022-11-23T03:12:18.7403234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7403363Z return func(*args, **kwargs) 2022-11-23T03:12:18.7403621Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7403735Z self.run_subtests( 2022-11-23T03:12:18.7404088Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7404232Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7404596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7404758Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7405134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7405258Z output = model(*input) 2022-11-23T03:12:18.7405585Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7405730Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7406105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7406262Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7406630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7406755Z _lazy_init(state, module) 2022-11-23T03:12:18.7407115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7407261Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7407597Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7407726Z return func(*args, **kwargs) 2022-11-23T03:12:18.7408104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7408188Z p_assert( 2022-11-23T03:12:18.7408525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7408650Z traceback.print_stack() 2022-11-23T03:12:18.7408782Z File "", line 1, in 2022-11-23T03:12:18.7408991Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7409135Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7409392Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7409554Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7409748Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7409853Z self.run() 2022-11-23T03:12:18.7410060Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7410207Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7410550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7410693Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7411053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7411183Z getattr(self, test_name)() 2022-11-23T03:12:18.7411519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7411668Z fn() 2022-11-23T03:12:18.7412044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7412169Z test(self, **param_kwargs) 2022-11-23T03:12:18.7412526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7412655Z return func(*args, **kwargs) 2022-11-23T03:12:18.7412914Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7413009Z self.run_subtests( 2022-11-23T03:12:18.7413361Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7413524Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7413885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7414047Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7414424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7414547Z output = model(*input) 2022-11-23T03:12:18.7414874Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7414995Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7415374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7415552Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7415918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7416041Z _lazy_init(state, module) 2022-11-23T03:12:18.7416404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7416549Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7416887Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7417014Z return func(*args, **kwargs) 2022-11-23T03:12:18.7417369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7417472Z p_assert( 2022-11-23T03:12:18.7417811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7417941Z traceback.print_stack() 2022-11-23T03:12:18.7418181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7418418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7418703Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7418947Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7419058Z File "", line 1, in 2022-11-23T03:12:18.7419271Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7419418Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7419620Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7419772Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7419990Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7420099Z self.run() 2022-11-23T03:12:18.7420284Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7420432Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7420826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7420960Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7421322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7421446Z getattr(self, test_name)() 2022-11-23T03:12:18.7421803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7421902Z fn() 2022-11-23T03:12:18.7422245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7422371Z test(self, **param_kwargs) 2022-11-23T03:12:18.7422727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7422852Z return func(*args, **kwargs) 2022-11-23T03:12:18.7423117Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7423231Z self.run_subtests( 2022-11-23T03:12:18.7423669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7423762Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7424456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7424609Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7424997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7425113Z output = model(*input) 2022-11-23T03:12:18.7425450Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7425590Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7425891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7426060Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7426458Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7426581Z _lazy_init(state, module) 2022-11-23T03:12:18.7426937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7427081Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7427417Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7427544Z return func(*args, **kwargs) 2022-11-23T03:12:18.7427921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7428027Z p_assert( 2022-11-23T03:12:18.7428412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7428553Z traceback.print_stack() 2022-11-23T03:12:18.7428685Z File "", line 1, in 2022-11-23T03:12:18.7428893Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7429038Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7429239Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7429393Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7429586Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7429691Z self.run() 2022-11-23T03:12:18.7429895Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7430043Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7430455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7430593Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7430956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7431081Z getattr(self, test_name)() 2022-11-23T03:12:18.7431418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7431521Z fn() 2022-11-23T03:12:18.7431887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7432009Z test(self, **param_kwargs) 2022-11-23T03:12:18.7432363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7432488Z return func(*args, **kwargs) 2022-11-23T03:12:18.7432752Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7432866Z self.run_subtests( 2022-11-23T03:12:18.7433198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7433363Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7433726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7433880Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7434257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7434379Z output = model(*input) 2022-11-23T03:12:18.7434707Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7434857Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7435219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7435399Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7435771Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7435894Z _lazy_init(state, module) 2022-11-23T03:12:18.7436248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7436392Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7436731Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7436860Z return func(*args, **kwargs) 2022-11-23T03:12:18.7437217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7437326Z p_assert( 2022-11-23T03:12:18.7437709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7437846Z traceback.print_stack() 2022-11-23T03:12:18.7437978Z File "", line 1, in 2022-11-23T03:12:18.7438191Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7438334Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7438537Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7438669Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7438886Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7438991Z self.run() 2022-11-23T03:12:18.7439195Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7439389Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7439736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7439871Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7440211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7440336Z getattr(self, test_name)() 2022-11-23T03:12:18.7440697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7440799Z fn() 2022-11-23T03:12:18.7441164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7441288Z test(self, **param_kwargs) 2022-11-23T03:12:18.7441642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7441772Z return func(*args, **kwargs) 2022-11-23T03:12:18.7442014Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7442131Z self.run_subtests( 2022-11-23T03:12:18.7442485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7442650Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7443068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7443223Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7443601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7443723Z output = model(*input) 2022-11-23T03:12:18.7444031Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7444179Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7444556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7444736Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7445106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7445228Z _lazy_init(state, module) 2022-11-23T03:12:18.7445579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7445723Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7446061Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7446167Z return func(*args, **kwargs) 2022-11-23T03:12:18.7446549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7446707Z p_assert( 2022-11-23T03:12:18.7447056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7447186Z traceback.print_stack() 2022-11-23T03:12:18.7447317Z File "", line 1, in 2022-11-23T03:12:18.7447529Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7447653Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7447858Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7448010Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7448227Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7448332Z self.run() 2022-11-23T03:12:18.7448542Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7448756Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7449105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7449221Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7449582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7449708Z getattr(self, test_name)() 2022-11-23T03:12:18.7450067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7450168Z fn() 2022-11-23T03:12:18.7450533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7450657Z test(self, **param_kwargs) 2022-11-23T03:12:18.7451012Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7451124Z return func(*args, **kwargs) 2022-11-23T03:12:18.7451387Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7451500Z self.run_subtests( 2022-11-23T03:12:18.7451851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7452020Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7452381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7452535Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7452910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7453011Z output = model(*input) 2022-11-23T03:12:18.7453336Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7453482Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7453859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7454037Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7454404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7454528Z _lazy_init(state, module) 2022-11-23T03:12:18.7454971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7455024Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7455358Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7455486Z return func(*args, **kwargs) 2022-11-23T03:12:18.7455866Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7456030Z p_assert( 2022-11-23T03:12:18.7456381Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7456510Z traceback.print_stack() 2022-11-23T03:12:18.7456750Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7456968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7457203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7457438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7457667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7457894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7458177Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7458408Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7458633Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7458857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7459070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7459398Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7459526Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7459752Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7459976Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7460209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7460437Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7460643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7460871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7461097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7461326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7461554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7461779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7462007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7462239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7462468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7462673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7462899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7463123Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7463347Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7463570Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7463794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7464327Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7464660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7464865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7465105Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7465345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7465565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7465795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7466007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7466262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7466530Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7466735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7466940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7467709Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7468455Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7469203Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7469944Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7470682Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7471464Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.7472202Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7472939Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7473721Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7474468Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7475200Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7475981Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7476715Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7477443Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7478175Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7478911Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7479641Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7480375Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7481110Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7481840Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7482122Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7482368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7482600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7482834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7483065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7483277Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7483507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7483736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7483967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7484299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7484528Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7484754Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7484979Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7485208Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7485413Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7485639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7485871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7486094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7486327Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7486555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7486785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7487011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7487311Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7487442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7487665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7487890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7488118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7488350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7488643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7488869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7489071Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7489296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7489522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7489748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7489972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7490248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7490564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7490709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7490932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7491138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7491363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7491586Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7491809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7492031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7492306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7492532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7492756Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7492960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7493709Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7494453Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7495192Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7495928Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.7496667Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7497399Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7498133Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7498905Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7499649Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7500383Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7501118Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7501900Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7502629Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7503361Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7504339Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7505178Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7505916Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7506555Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7507277Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7508076Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.7508334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7508573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7508804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7509036Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7509152Z dist init r=3, world=4 2022-11-23T03:12:18.7509486Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7509788Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7510175Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7510486Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7510791Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7511154Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7511404Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7511711Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7512016Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.7512317Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7512622Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7512921Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.7513015Z dist init r=1, world=4 2022-11-23T03:12:18.7513373Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7513693Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7514005Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7514309Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7514611Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7514913Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7515262Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.7515570Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7515890Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7516218Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7516478Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7516809Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.7516924Z dist init r=0, world=4 2022-11-23T03:12:18.7517247Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7517567Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7517877Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7518179Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7518493Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.7518797Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7519102Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7519404Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7519709Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7519993Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7520301Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7520609Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.7520764Z dist init r=2, world=4 2022-11-23T03:12:18.7521107Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7521428Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7521738Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.7522094Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7522404Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7522707Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7523011Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7523310Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7523639Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7523956Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7524246Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7524547Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.7524652Z ok (6.924s) 2022-11-23T03:12:18.7525002Z test_nested_always_wrap_model_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23764 2022-11-23T03:12:18.7525233Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23765 2022-11-23T03:12:18.7525457Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23766 2022-11-23T03:12:18.7525672Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23767 2022-11-23T03:12:18.7526038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.7526217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.7526597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.7526793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.7527164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.7527348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.7527729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.7527921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.7528290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.7528445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.7528881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.7529011Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.7529379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.7529556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.7529992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.7530196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.7530449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.7530695Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.7530921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.7531162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.7531644Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.7531967Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.7532413Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.7532806Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.7533040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.7533267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.7533494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.7533699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.7533937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7534174Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7534416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7534701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7535674Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.7535913Z warnings.warn( 2022-11-23T03:12:18.7536931Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.7537048Z warnings.warn( 2022-11-23T03:12:18.7538058Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.7538200Z warnings.warn( 2022-11-23T03:12:18.7539250Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.7539373Z warnings.warn( 2022-11-23T03:12:18.7539487Z File "", line 1, in 2022-11-23T03:12:18.7539702Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7539850Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7540057Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7540210Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7540426Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7540535Z self.run() 2022-11-23T03:12:18.7540792Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7540926Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7541279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7541417Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7541783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7541911Z getattr(self, test_name)() 2022-11-23T03:12:18.7542273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7542376Z fn() 2022-11-23T03:12:18.7542723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7542849Z test(self, **param_kwargs) 2022-11-23T03:12:18.7543367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7543411Z return func(*args, **kwargs) 2022-11-23T03:12:18.7543671Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7543791Z self.run_subtests( 2022-11-23T03:12:18.7544537Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7544710Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7545054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7545212Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7545599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7545718Z output = model(*input) 2022-11-23T03:12:18.7546049Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7546156Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7546494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7546676Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7547045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7547152Z _lazy_init(state, module) 2022-11-23T03:12:18.7547598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7547659Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7547999Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7548125Z return func(*args, **kwargs) 2022-11-23T03:12:18.7548579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7548692Z p_assert( 2022-11-23T03:12:18.7549012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7549140Z traceback.print_stack() 2022-11-23T03:12:18.7549271Z File "", line 1, in 2022-11-23T03:12:18.7549488Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7549634Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7549839Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7549988Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7550206Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7550291Z self.run() 2022-11-23T03:12:18.7550564Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7550714Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7551059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7551194Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7551554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7551679Z getattr(self, test_name)() 2022-11-23T03:12:18.7552028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7552109Z fn() 2022-11-23T03:12:18.7552476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7552600Z test(self, **param_kwargs) 2022-11-23T03:12:18.7552957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7553097Z return func(*args, **kwargs) 2022-11-23T03:12:18.7553354Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7553468Z self.run_subtests( 2022-11-23T03:12:18.7553819Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7553964Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7554329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7554484Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7554859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7554980Z output = model(*input) 2022-11-23T03:12:18.7555411Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7555456Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7555836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7555994Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7556361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7556485Z _lazy_init(state, module) 2022-11-23T03:12:18.7556837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7556984Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7557322Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7557452Z return func(*args, **kwargs) 2022-11-23T03:12:18.7557879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7557970Z p_assert( 2022-11-23T03:12:18.7558315Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7558444Z traceback.print_stack() 2022-11-23T03:12:18.7558574Z File "", line 1, in 2022-11-23T03:12:18.7558787Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7558933Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7559137Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7559336Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7559486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7559645Z self.run() 2022-11-23T03:12:18.7559885Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7560003Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7560343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7560477Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7560839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7560942Z getattr(self, test_name)() 2022-11-23T03:12:18.7561300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7561401Z fn() 2022-11-23T03:12:18.7561767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7561897Z test(self, **param_kwargs) 2022-11-23T03:12:18.7562265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7562394Z return func(*args, **kwargs) 2022-11-23T03:12:18.7562654Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7562750Z self.run_subtests( 2022-11-23T03:12:18.7563105Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7563272Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7563640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7563795Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7564172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7564389Z output = model(*input) 2022-11-23T03:12:18.7564634Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7564758Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7565137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7565318Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7565687Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7565809Z _lazy_init(state, module) 2022-11-23T03:12:18.7566164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7566309Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7566648Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7566758Z return func(*args, **kwargs) 2022-11-23T03:12:18.7567184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7567296Z p_assert( 2022-11-23T03:12:18.7567638Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7567765Z traceback.print_stack() 2022-11-23T03:12:18.7567898Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7568108Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7568255Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7568442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7568594Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7568814Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7568982Z self.run() 2022-11-23T03:12:18.7569191Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7569340Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7569681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7569796Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7570158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7570286Z getattr(self, test_name)() 2022-11-23T03:12:18.7570645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7570745Z fn() 2022-11-23T03:12:18.7571160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7571293Z test(self, **param_kwargs) 2022-11-23T03:12:18.7571659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7571766Z return func(*args, **kwargs) 2022-11-23T03:12:18.7572026Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7572141Z self.run_subtests( 2022-11-23T03:12:18.7572495Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7572658Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7573022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7573270Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7573649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7573754Z output = model(*input) 2022-11-23T03:12:18.7574084Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7574226Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7574608Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7574881Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7575159Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7575285Z _lazy_init(state, module) 2022-11-23T03:12:18.7575637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7575762Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7576103Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7576233Z return func(*args, **kwargs) 2022-11-23T03:12:18.7576663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7576776Z p_assert( 2022-11-23T03:12:18.7577144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7577244Z traceback.print_stack() 2022-11-23T03:12:18.7577486Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7577705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7577940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7578174Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7578306Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7578568Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7578792Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7578925Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7579081Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7579275Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7579381Z self.run() 2022-11-23T03:12:18.7579588Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7579739Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7580084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7580268Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7580585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7580716Z getattr(self, test_name)() 2022-11-23T03:12:18.7581059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7581161Z fn() 2022-11-23T03:12:18.7581528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.7581655Z test(self, **param_kwargs) 2022-11-23T03:12:18.7582011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7582138Z return func(*args, **kwargs) 2022-11-23T03:12:18.7582394Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7582488Z self.run_subtests( 2022-11-23T03:12:18.7582844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7583014Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7583380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7583536Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7584124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7584257Z output = model(*input) 2022-11-23T03:12:18.7584667Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7584813Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7585191Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7585379Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7585813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7585947Z _lazy_init(state, module) 2022-11-23T03:12:18.7586309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7586463Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7586808Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7587029Z return func(*args, **kwargs) 2022-11-23T03:12:18.7587301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7587407Z p_assert( 2022-11-23T03:12:18.7587747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7587876Z traceback.print_stack() 2022-11-23T03:12:18.7588007Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7588288Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7588436Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7588620Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7588775Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7588987Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7589094Z self.run() 2022-11-23T03:12:18.7589299Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7589446Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7589789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7589925Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7590267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7590398Z getattr(self, test_name)() 2022-11-23T03:12:18.7590760Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7590862Z fn() 2022-11-23T03:12:18.7591230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.7591357Z test(self, **param_kwargs) 2022-11-23T03:12:18.7591716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7591844Z return func(*args, **kwargs) 2022-11-23T03:12:18.7592081Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7592199Z self.run_subtests( 2022-11-23T03:12:18.7592551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7592810Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7593092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7593249Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7593628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7593855Z output = model(*input) 2022-11-23T03:12:18.7594061Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7594203Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7594582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7594761Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7595184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7595313Z _lazy_init(state, module) 2022-11-23T03:12:18.7595668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7595812Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7596130Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7596258Z return func(*args, **kwargs) 2022-11-23T03:12:18.7596638Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7596742Z p_assert( 2022-11-23T03:12:18.7597079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7597209Z traceback.print_stack() 2022-11-23T03:12:18.7597394Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7597613Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7597738Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7597946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7598100Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7598317Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7598424Z self.run() 2022-11-23T03:12:18.7598628Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7598775Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7599098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7599234Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7599598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7599732Z getattr(self, test_name)() 2022-11-23T03:12:18.7600097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7600200Z fn() 2022-11-23T03:12:18.7600565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.7600695Z test(self, **param_kwargs) 2022-11-23T03:12:18.7601029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7601157Z return func(*args, **kwargs) 2022-11-23T03:12:18.7601421Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7601536Z self.run_subtests( 2022-11-23T03:12:18.7601890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7602065Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7602435Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7602593Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7602947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7603072Z output = model(*input) 2022-11-23T03:12:18.7603401Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7603546Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7603923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7604101Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7604522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7604654Z _lazy_init(state, module) 2022-11-23T03:12:18.7604990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7605139Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7605476Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7605602Z return func(*args, **kwargs) 2022-11-23T03:12:18.7605983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7606091Z p_assert( 2022-11-23T03:12:18.7606428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7606558Z traceback.print_stack() 2022-11-23T03:12:18.7606720Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7606938Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7607083Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7607291Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7607447Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7607665Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7607773Z self.run() 2022-11-23T03:12:18.7607957Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7608107Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7608460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7608596Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7608959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7609092Z getattr(self, test_name)() 2022-11-23T03:12:18.7609454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7609554Z fn() 2022-11-23T03:12:18.7609900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.7610024Z test(self, **param_kwargs) 2022-11-23T03:12:18.7610381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7610507Z return func(*args, **kwargs) 2022-11-23T03:12:18.7610768Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7610886Z self.run_subtests( 2022-11-23T03:12:18.7611239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7611411Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7611755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7611910Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7612287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7612409Z output = model(*input) 2022-11-23T03:12:18.7612737Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7612881Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7613260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7613438Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7613835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7613969Z _lazy_init(state, module) 2022-11-23T03:12:18.7614325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7614470Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7614809Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7614935Z return func(*args, **kwargs) 2022-11-23T03:12:18.7615311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7615415Z p_assert( 2022-11-23T03:12:18.7615731Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7615858Z traceback.print_stack() 2022-11-23T03:12:18.7616154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7616483Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7616720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7616868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7617001Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7617214Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7617338Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7617538Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7617690Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7626805Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7626976Z self.run() 2022-11-23T03:12:18.7627211Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7627366Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7627761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7627877Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7628250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7628380Z getattr(self, test_name)() 2022-11-23T03:12:18.7628748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7628899Z fn() 2022-11-23T03:12:18.7629228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7629356Z test(self, **param_kwargs) 2022-11-23T03:12:18.7629729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7629837Z return func(*args, **kwargs) 2022-11-23T03:12:18.7630099Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7630222Z self.run_subtests( 2022-11-23T03:12:18.7630585Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7630757Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7631126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7631285Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7631665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7631772Z output = model(*input) 2022-11-23T03:12:18.7632293Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7632461Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7632849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7633032Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7633408Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7633535Z _lazy_init(state, module) 2022-11-23T03:12:18.7633895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7634021Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7634363Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7634571Z return func(*args, **kwargs) 2022-11-23T03:12:18.7634965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7635078Z p_assert( 2022-11-23T03:12:18.7635421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7635555Z traceback.print_stack() 2022-11-23T03:12:18.7635693Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7635884Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7636035Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7636243Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7636402Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7636703Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7636828Z self.run() 2022-11-23T03:12:18.7636951Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7637080Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7637429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7637568Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7637938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7638068Z getattr(self, test_name)() 2022-11-23T03:12:18.7638434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7638538Z fn() 2022-11-23T03:12:18.7638911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7639022Z test(self, **param_kwargs) 2022-11-23T03:12:18.7639387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7639517Z return func(*args, **kwargs) 2022-11-23T03:12:18.7639780Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7639900Z self.run_subtests( 2022-11-23T03:12:18.7640257Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7640425Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7640796Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7640931Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7641314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7641441Z output = model(*input) 2022-11-23T03:12:18.7641837Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7642026Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7642374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7642559Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7642928Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7643094Z _lazy_init(state, module) 2022-11-23T03:12:18.7643455Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7643607Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7643947Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7644152Z return func(*args, **kwargs) 2022-11-23T03:12:18.7644655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7644688Z p_assert( 2022-11-23T03:12:18.7645028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7645136Z traceback.print_stack() 2022-11-23T03:12:18.7645273Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7645514Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7645679Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7645874Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7646031Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7646253Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7646368Z self.run() 2022-11-23T03:12:18.7646555Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7646706Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7647051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7647192Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7647562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7647693Z getattr(self, test_name)() 2022-11-23T03:12:18.7648054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7648161Z fn() 2022-11-23T03:12:18.7648507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7648640Z test(self, **param_kwargs) 2022-11-23T03:12:18.7649004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7649134Z return func(*args, **kwargs) 2022-11-23T03:12:18.7649397Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7649520Z self.run_subtests( 2022-11-23T03:12:18.7649881Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7650027Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7650396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7650553Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7650931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7651064Z output = model(*input) 2022-11-23T03:12:18.7651446Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7651598Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7651983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7652166Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7652515Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7652641Z _lazy_init(state, module) 2022-11-23T03:12:18.7652996Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7653145Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7653487Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7653673Z return func(*args, **kwargs) 2022-11-23T03:12:18.7654062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7654172Z p_assert( 2022-11-23T03:12:18.7654488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7654621Z traceback.print_stack() 2022-11-23T03:12:18.7654756Z File "", line 1, in 2022-11-23T03:12:18.7654970Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7655118Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7655326Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7655585Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7655682Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7655799Z self.run() 2022-11-23T03:12:18.7656005Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7656158Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7656500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7656637Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7657001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7657132Z getattr(self, test_name)() 2022-11-23T03:12:18.7657470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7657574Z fn() 2022-11-23T03:12:18.7657941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7658073Z test(self, **param_kwargs) 2022-11-23T03:12:18.7658436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7658567Z return func(*args, **kwargs) 2022-11-23T03:12:18.7658829Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7658949Z self.run_subtests( 2022-11-23T03:12:18.7659286Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7659453Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7659824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7659981Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7660418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7660542Z output = model(*input) 2022-11-23T03:12:18.7660883Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7661031Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7661388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7661569Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7661940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7662070Z _lazy_init(state, module) 2022-11-23T03:12:18.7662423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7662574Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7662979Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7663108Z return func(*args, **kwargs) 2022-11-23T03:12:18.7663467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7663575Z p_assert( 2022-11-23T03:12:18.7664154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7664299Z traceback.print_stack() 2022-11-23T03:12:18.7664649Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7664875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7665131Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7665355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7665475Z File "", line 1, in 2022-11-23T03:12:18.7665704Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7665851Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7665966Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7666122Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7666341Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7666454Z self.run() 2022-11-23T03:12:18.7666638Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7666792Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7667148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7667287Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7667749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7667789Z getattr(self, test_name)() 2022-11-23T03:12:18.7668153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7668256Z fn() 2022-11-23T03:12:18.7668601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7668730Z test(self, **param_kwargs) 2022-11-23T03:12:18.7669093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7669222Z return func(*args, **kwargs) 2022-11-23T03:12:18.7669484Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7669605Z self.run_subtests( 2022-11-23T03:12:18.7669960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7670215Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7670576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7670735Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7671233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7671367Z output = model(*input) 2022-11-23T03:12:18.7671701Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7671849Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7672229Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7672409Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7672833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7672962Z _lazy_init(state, module) 2022-11-23T03:12:18.7673317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7673467Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7673808Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7673938Z return func(*args, **kwargs) 2022-11-23T03:12:18.7674320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7674430Z p_assert( 2022-11-23T03:12:18.7674747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7674887Z traceback.print_stack() 2022-11-23T03:12:18.7675025Z File "", line 1, in 2022-11-23T03:12:18.7675243Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7675385Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7675592Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7675745Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7675960Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7676047Z self.run() 2022-11-23T03:12:18.7676255Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7676487Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7676753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7676894Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7677267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7677398Z getattr(self, test_name)() 2022-11-23T03:12:18.7677737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7677842Z fn() 2022-11-23T03:12:18.7678207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7678337Z test(self, **param_kwargs) 2022-11-23T03:12:18.7678696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7678827Z return func(*args, **kwargs) 2022-11-23T03:12:18.7679087Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7679204Z self.run_subtests( 2022-11-23T03:12:18.7679589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7679770Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7680142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7680305Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7680689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7680891Z output = model(*input) 2022-11-23T03:12:18.7681149Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7681386Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7681654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7681886Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7682260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7682415Z _lazy_init(state, module) 2022-11-23T03:12:18.7682769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7682916Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7683255Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7683385Z return func(*args, **kwargs) 2022-11-23T03:12:18.7683763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7683849Z p_assert( 2022-11-23T03:12:18.7684188Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7684325Z traceback.print_stack() 2022-11-23T03:12:18.7684463Z File "", line 1, in 2022-11-23T03:12:18.7684682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7684830Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7685038Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7685171Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7685388Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7685499Z self.run() 2022-11-23T03:12:18.7685707Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7685859Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7686199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7686336Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7686707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7686896Z getattr(self, test_name)() 2022-11-23T03:12:18.7687257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7687356Z fn() 2022-11-23T03:12:18.7687723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7687907Z test(self, **param_kwargs) 2022-11-23T03:12:18.7688203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7688328Z return func(*args, **kwargs) 2022-11-23T03:12:18.7688589Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7688683Z self.run_subtests( 2022-11-23T03:12:18.7689091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7689265Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7689633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7689793Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7690164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7690290Z output = model(*input) 2022-11-23T03:12:18.7690617Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7690741Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7691117Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7691356Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7691732Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7691859Z _lazy_init(state, module) 2022-11-23T03:12:18.7692213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7692363Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7692702Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7692808Z return func(*args, **kwargs) 2022-11-23T03:12:18.7693194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7693304Z p_assert( 2022-11-23T03:12:18.7693645Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7693782Z traceback.print_stack() 2022-11-23T03:12:18.7693960Z File "", line 1, in 2022-11-23T03:12:18.7694234Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7694258Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7694513Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7694615Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7694828Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7694939Z self.run() 2022-11-23T03:12:18.7695148Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7695297Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7695642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7695760Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7696132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7696259Z getattr(self, test_name)() 2022-11-23T03:12:18.7696623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7696727Z fn() 2022-11-23T03:12:18.7697092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7697215Z test(self, **param_kwargs) 2022-11-23T03:12:18.7697574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7697680Z return func(*args, **kwargs) 2022-11-23T03:12:18.7697937Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7698055Z self.run_subtests( 2022-11-23T03:12:18.7698457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7698626Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7698994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7699149Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7699528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7699629Z output = model(*input) 2022-11-23T03:12:18.7699956Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7700104Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7700485Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7700717Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7701090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7701218Z _lazy_init(state, module) 2022-11-23T03:12:18.7701565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7701690Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7702029Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7702160Z return func(*args, **kwargs) 2022-11-23T03:12:18.7702540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7702648Z p_assert( 2022-11-23T03:12:18.7702991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7703127Z traceback.print_stack() 2022-11-23T03:12:18.7703369Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7703586Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7703816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7704323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7704446Z File "", line 1, in 2022-11-23T03:12:18.7704682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7704848Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7705060Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7705225Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7705343Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7705456Z self.run() 2022-11-23T03:12:18.7705663Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7705816Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7706168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7706305Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7706671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7706777Z getattr(self, test_name)() 2022-11-23T03:12:18.7707142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7707246Z fn() 2022-11-23T03:12:18.7707614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7707748Z test(self, **param_kwargs) 2022-11-23T03:12:18.7708176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7708315Z return func(*args, **kwargs) 2022-11-23T03:12:18.7708578Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7708672Z self.run_subtests( 2022-11-23T03:12:18.7709030Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7709196Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7709563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7709720Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7710094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7710304Z output = model(*input) 2022-11-23T03:12:18.7710643Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7710766Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7711144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7711327Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7711696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7711824Z _lazy_init(state, module) 2022-11-23T03:12:18.7712180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7712330Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7712680Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7712787Z return func(*args, **kwargs) 2022-11-23T03:12:18.7713172Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7713282Z p_assert( 2022-11-23T03:12:18.7713626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7713757Z traceback.print_stack() 2022-11-23T03:12:18.7713897Z File "", line 1, in 2022-11-23T03:12:18.7714108Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7714255Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7714438Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7714592Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7714817Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7714926Z self.run() 2022-11-23T03:12:18.7715137Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7715288Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7715632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7715772Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7716112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7716241Z getattr(self, test_name)() 2022-11-23T03:12:18.7716702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7716731Z fn() 2022-11-23T03:12:18.7717078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7717263Z test(self, **param_kwargs) 2022-11-23T03:12:18.7717634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7717740Z return func(*args, **kwargs) 2022-11-23T03:12:18.7718003Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7718217Z self.run_subtests( 2022-11-23T03:12:18.7718476Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7718640Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7719009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7719166Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7719622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7719746Z output = model(*input) 2022-11-23T03:12:18.7720103Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7720248Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7720630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7720811Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7721175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7721300Z _lazy_init(state, module) 2022-11-23T03:12:18.7721653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7721803Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7722130Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7722261Z return func(*args, **kwargs) 2022-11-23T03:12:18.7722645Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7722755Z p_assert( 2022-11-23T03:12:18.7723095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7723225Z traceback.print_stack() 2022-11-23T03:12:18.7723361Z File "", line 1, in 2022-11-23T03:12:18.7723552Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7723701Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7723909Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7724067Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7724289Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7724398Z self.run() 2022-11-23T03:12:18.7724602Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7724753Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7725131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7725213Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7725573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7725695Z getattr(self, test_name)() 2022-11-23T03:12:18.7726054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7726156Z fn() 2022-11-23T03:12:18.7726524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7726707Z test(self, **param_kwargs) 2022-11-23T03:12:18.7727052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7727183Z return func(*args, **kwargs) 2022-11-23T03:12:18.7727446Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7727565Z self.run_subtests( 2022-11-23T03:12:18.7727922Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7728092Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7728459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7728615Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7729024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7729148Z output = model(*input) 2022-11-23T03:12:18.7729480Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7729625Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7729996Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7730172Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7730537Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7730660Z _lazy_init(state, module) 2022-11-23T03:12:18.7730990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7731139Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7731478Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7731603Z return func(*args, **kwargs) 2022-11-23T03:12:18.7731986Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7732094Z p_assert( 2022-11-23T03:12:18.7732427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7732581Z traceback.print_stack() 2022-11-23T03:12:18.7732671Z File "", line 1, in 2022-11-23T03:12:18.7732886Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7733033Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7733240Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7733397Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7733615Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7733724Z self.run() 2022-11-23T03:12:18.7733907Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7734055Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7734399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7734537Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7734897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7735020Z getattr(self, test_name)() 2022-11-23T03:12:18.7735374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7735472Z fn() 2022-11-23T03:12:18.7735817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7735995Z test(self, **param_kwargs) 2022-11-23T03:12:18.7736366Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7736526Z return func(*args, **kwargs) 2022-11-23T03:12:18.7736765Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7736884Z self.run_subtests( 2022-11-23T03:12:18.7737244Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7737413Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7737755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7737912Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7738350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7738474Z output = model(*input) 2022-11-23T03:12:18.7738806Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7738952Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7739327Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7739507Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7739853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7739980Z _lazy_init(state, module) 2022-11-23T03:12:18.7740334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7740487Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7740831Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7740962Z return func(*args, **kwargs) 2022-11-23T03:12:18.7741344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7741453Z p_assert( 2022-11-23T03:12:18.7741769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7741900Z traceback.print_stack() 2022-11-23T03:12:18.7742145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7742385Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7742620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7742860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7742996Z File "", line 1, in 2022-11-23T03:12:18.7743267Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7743394Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7743602Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7743762Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7744261Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7744386Z self.run() 2022-11-23T03:12:18.7744673Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7744829Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7745173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7745316Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7745762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7745905Z getattr(self, test_name)() 2022-11-23T03:12:18.7746284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7746386Z fn() 2022-11-23T03:12:18.7746660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7746789Z test(self, **param_kwargs) 2022-11-23T03:12:18.7747127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7747258Z return func(*args, **kwargs) 2022-11-23T03:12:18.7747518Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7747702Z self.run_subtests( 2022-11-23T03:12:18.7748065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7748233Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7748601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7748763Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7749118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7749241Z output = model(*input) 2022-11-23T03:12:18.7749570Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7749719Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7750095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7750281Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7750655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7750775Z _lazy_init(state, module) 2022-11-23T03:12:18.7751123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7751250Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7751587Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7751710Z return func(*args, **kwargs) 2022-11-23T03:12:18.7752091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7752201Z p_assert( 2022-11-23T03:12:18.7752536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7752671Z traceback.print_stack() 2022-11-23T03:12:18.7752782Z File "", line 1, in 2022-11-23T03:12:18.7752995Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7753143Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7753346Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7753501Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7753716Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7753826Z self.run() 2022-11-23T03:12:18.7754032Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7754160Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7754507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7754652Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7755068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7755203Z getattr(self, test_name)() 2022-11-23T03:12:18.7755567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7755672Z fn() 2022-11-23T03:12:18.7756043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7756148Z test(self, **param_kwargs) 2022-11-23T03:12:18.7756507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7756634Z return func(*args, **kwargs) 2022-11-23T03:12:18.7756894Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7757108Z self.run_subtests( 2022-11-23T03:12:18.7757429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7757597Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7757967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7758102Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7758484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7758609Z output = model(*input) 2022-11-23T03:12:18.7758938Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7759087Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7759468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7759654Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7760099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7760125Z _lazy_init(state, module) 2022-11-23T03:12:18.7760477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7760624Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7760957Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7761089Z return func(*args, **kwargs) 2022-11-23T03:12:18.7761470Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7761576Z p_assert( 2022-11-23T03:12:18.7761916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7762032Z traceback.print_stack() 2022-11-23T03:12:18.7762167Z File "", line 1, in 2022-11-23T03:12:18.7762382Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7762526Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7762731Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7762887Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7763102Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7763187Z self.run() 2022-11-23T03:12:18.7763397Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7763595Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7763990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7764034Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7764440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7764573Z getattr(self, test_name)() 2022-11-23T03:12:18.7764933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7765012Z fn() 2022-11-23T03:12:18.7765434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7765498Z test(self, **param_kwargs) 2022-11-23T03:12:18.7765850Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7765974Z return func(*args, **kwargs) 2022-11-23T03:12:18.7766230Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7766393Z self.run_subtests( 2022-11-23T03:12:18.7766749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7766894Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7767257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7767410Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7767783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7767903Z output = model(*input) 2022-11-23T03:12:18.7768226Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7768367Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7768740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7768903Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7769270Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7769391Z _lazy_init(state, module) 2022-11-23T03:12:18.7769738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7769882Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7770214Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7770339Z return func(*args, **kwargs) 2022-11-23T03:12:18.7770717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7770802Z p_assert( 2022-11-23T03:12:18.7771145Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7771337Z traceback.print_stack() 2022-11-23T03:12:18.7771475Z File "", line 1, in 2022-11-23T03:12:18.7771688Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7771837Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7772045Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7772201Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7772395Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7772504Z self.run() 2022-11-23T03:12:18.7772713Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7772864Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7773212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7773404Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7773782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7773886Z getattr(self, test_name)() 2022-11-23T03:12:18.7774252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7774355Z fn() 2022-11-23T03:12:18.7774725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7774856Z test(self, **param_kwargs) 2022-11-23T03:12:18.7775212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7775342Z return func(*args, **kwargs) 2022-11-23T03:12:18.7775601Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7775804Z self.run_subtests( 2022-11-23T03:12:18.7776168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7776336Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7776706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7776864Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7777245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7777370Z output = model(*input) 2022-11-23T03:12:18.7777697Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7777817Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7778196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7778385Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7778754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7778880Z _lazy_init(state, module) 2022-11-23T03:12:18.7779233Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7779382Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7779722Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7779828Z return func(*args, **kwargs) 2022-11-23T03:12:18.7780207Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7780312Z p_assert( 2022-11-23T03:12:18.7780664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7780796Z traceback.print_stack() 2022-11-23T03:12:18.7781043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7781286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7781516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7781728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7781865Z File "", line 1, in 2022-11-23T03:12:18.7782082Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7782230Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7782435Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7782596Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7782861Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7782976Z self.run() 2022-11-23T03:12:18.7783160Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7783314Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7783663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7783802Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7784397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7784591Z getattr(self, test_name)() 2022-11-23T03:12:18.7784993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7785081Z fn() 2022-11-23T03:12:18.7785514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7785659Z test(self, **param_kwargs) 2022-11-23T03:12:18.7786002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7786151Z return func(*args, **kwargs) 2022-11-23T03:12:18.7786371Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7787304Z self.run_subtests( 2022-11-23T03:12:18.7788065Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7788218Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7788610Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7788771Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7789190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7789316Z output = model(*input) 2022-11-23T03:12:18.7789650Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7789799Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7790187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7790369Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7790720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7790844Z _lazy_init(state, module) 2022-11-23T03:12:18.7791197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7791350Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7791693Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7791820Z return func(*args, **kwargs) 2022-11-23T03:12:18.7792202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7792305Z p_assert( 2022-11-23T03:12:18.7792624Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7792752Z traceback.print_stack() 2022-11-23T03:12:18.7792883Z File "", line 1, in 2022-11-23T03:12:18.7793096Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7793240Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7793445Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7793600Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7794100Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7794227Z self.run() 2022-11-23T03:12:18.7794434Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7794584Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7794927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7795064Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7795426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7795552Z getattr(self, test_name)() 2022-11-23T03:12:18.7795895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7795993Z fn() 2022-11-23T03:12:18.7796445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7796569Z test(self, **param_kwargs) 2022-11-23T03:12:18.7796928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7797054Z return func(*args, **kwargs) 2022-11-23T03:12:18.7797311Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7797426Z self.run_subtests( 2022-11-23T03:12:18.7798094Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7798256Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7798619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7798776Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7799151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7799272Z output = model(*input) 2022-11-23T03:12:18.7799599Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7799741Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7800098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7800276Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7800643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7800766Z _lazy_init(state, module) 2022-11-23T03:12:18.7801115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7801264Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7801602Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7801728Z return func(*args, **kwargs) 2022-11-23T03:12:18.7802087Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7802192Z p_assert( 2022-11-23T03:12:18.7802530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7802656Z traceback.print_stack() 2022-11-23T03:12:18.7802786Z File "", line 1, in 2022-11-23T03:12:18.7802996Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7803143Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7803326Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7803534Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7803757Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7803863Z self.run() 2022-11-23T03:12:18.7804068Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7804218Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7804348Z File "", line 1, in 2022-11-23T03:12:18.7804692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7804806Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7805015Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7805157Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7805518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7805693Z getattr(self, test_name)() 2022-11-23T03:12:18.7805899Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7806052Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7806411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7806494Z fn() 2022-11-23T03:12:18.7806708Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7806814Z self.run() 2022-11-23T03:12:18.7807181Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7807305Z test(self, **param_kwargs) 2022-11-23T03:12:18.7807509Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7807655Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7808003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7808130Z return func(*args, **kwargs) 2022-11-23T03:12:18.7808542Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7808824Z self.run_subtests( 2022-11-23T03:12:18.7809175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7809340Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7809675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7809811Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7810155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7810312Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7810681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7810805Z getattr(self, test_name)() 2022-11-23T03:12:18.7811182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7811302Z output = model(*input) 2022-11-23T03:12:18.7811659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7811759Z fn() 2022-11-23T03:12:18.7812067Z 
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7812210Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7812572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7812699Z test(self, **param_kwargs) 2022-11-23T03:12:18.7813124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7813307Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7813668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7813794Z return func(*args, **kwargs) 2022-11-23T03:12:18.7814146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7814269Z _lazy_init(state, module) 2022-11-23T03:12:18.7814527Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7814644Z self.run_subtests( 2022-11-23T03:12:18.7815155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7815344Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7815688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7815847Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7816162Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7816283Z return func(*args, **kwargs) 2022-11-23T03:12:18.7816635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T03:12:18.7816782Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7817337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7817443Z p_assert( 2022-11-23T03:12:18.7817814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7817941Z output = model(*input) 2022-11-23T03:12:18.7818257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7818383Z traceback.print_stack() 2022-11-23T03:12:18.7818710Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7818850Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7819225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7819402Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7819767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7819889Z _lazy_init(state, module) 2022-11-23T03:12:18.7820221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7820371Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7820708Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7820836Z return func(*args, **kwargs) 2022-11-23T03:12:18.7821213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7821317Z p_assert( 2022-11-23T03:12:18.7821651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.7821779Z traceback.print_stack() 2022-11-23T03:12:18.7821999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7822237Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7822471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7822755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7822895Z File "", line 1, in 2022-11-23T03:12:18.7823109Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7823253Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7823457Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7823591Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7824380Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7824496Z self.run() 2022-11-23T03:12:18.7824704Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7824854Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7825208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7825433Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7825800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7825908Z getattr(self, test_name)() 2022-11-23T03:12:18.7826270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7826369Z fn() 2022-11-23T03:12:18.7826735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7826859Z test(self, **param_kwargs) 2022-11-23T03:12:18.7827212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7827338Z return func(*args, **kwargs) 2022-11-23T03:12:18.7827594Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7827697Z self.run_subtests( 2022-11-23T03:12:18.7828049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7828374Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7828722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7828871Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7829230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7829365Z output = model(*input) 2022-11-23T03:12:18.7829727Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7829846Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7830216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7830388Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7830740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7830859Z _lazy_init(state, module) 2022-11-23T03:12:18.7831196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7831335Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7831659Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7831762Z return func(*args, **kwargs) 2022-11-23T03:12:18.7832129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7832232Z p_assert( 2022-11-23T03:12:18.7832789Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7832927Z traceback.print_stack() 2022-11-23T03:12:18.7833059Z File "", line 1, in 2022-11-23T03:12:18.7833269Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7833392Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7833594Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7833744Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7833959Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7834062Z self.run() 2022-11-23T03:12:18.7834266Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7834410Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7834809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7834926Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7835288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7835413Z getattr(self, test_name)() 2022-11-23T03:12:18.7835925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7836024Z fn() 2022-11-23T03:12:18.7836375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7836494Z test(self, **param_kwargs) 2022-11-23T03:12:18.7837024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7837130Z return func(*args, **kwargs) 2022-11-23T03:12:18.7837387Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7837509Z self.run_subtests( 2022-11-23T03:12:18.7837861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7838024Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7838387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7838538Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7838910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7839011Z output = model(*input) 2022-11-23T03:12:18.7839335Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7839477Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7839862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7840039Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7840567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7840686Z _lazy_init(state, module) 2022-11-23T03:12:18.7841023Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7841143Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7841471Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7841593Z return func(*args, **kwargs) 2022-11-23T03:12:18.7841959Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7842063Z p_assert( 2022-11-23T03:12:18.7842437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7842568Z traceback.print_stack() 2022-11-23T03:12:18.7842767Z File "", line 1, in 2022-11-23T03:12:18.7842957Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7843095Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7843293Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7843621Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7843836Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7843942Z self.run() 2022-11-23T03:12:18.7844143Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7844269Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7844675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7844814Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7845176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7845300Z getattr(self, test_name)() 2022-11-23T03:12:18.7845657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7845757Z fn() 2022-11-23T03:12:18.7846118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7846222Z test(self, **param_kwargs) 2022-11-23T03:12:18.7846577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7846702Z return func(*args, **kwargs) 2022-11-23T03:12:18.7846966Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7847080Z self.run_subtests( 2022-11-23T03:12:18.7847431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7847596Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7847958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7848092Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7848465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7848585Z output = model(*input) 2022-11-23T03:12:18.7848908Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7849049Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7849434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7849609Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7849975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7850078Z _lazy_init(state, module) 2022-11-23T03:12:18.7850427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7850570Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7851063Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7851186Z return func(*args, **kwargs) 2022-11-23T03:12:18.7851549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7851654Z p_assert( 2022-11-23T03:12:18.7852024Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7852136Z traceback.print_stack() 2022-11-23T03:12:18.7852262Z File "", line 1, in 2022-11-23T03:12:18.7852466Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7852606Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7852802Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7852948Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7853153Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7853255Z self.run() 2022-11-23T03:12:18.7853435Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7853577Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7853979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7854111Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7854462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7854582Z getattr(self, test_name)() 2022-11-23T03:12:18.7854929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7855007Z fn() 2022-11-23T03:12:18.7855361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7855480Z test(self, **param_kwargs) 2022-11-23T03:12:18.7855822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7855943Z return func(*args, **kwargs) 2022-11-23T03:12:18.7856374Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7856487Z self.run_subtests( 2022-11-23T03:12:18.7856839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7856985Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7857348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7857500Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7857873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7857992Z output = model(*input) 2022-11-23T03:12:18.7858314Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7858460Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7858840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7858999Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7859528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7859646Z _lazy_init(state, module) 2022-11-23T03:12:18.7859986Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7860128Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7860635Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7860759Z return func(*args, **kwargs) 2022-11-23T03:12:18.7861137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7861244Z p_assert( 2022-11-23T03:12:18.7861609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7861744Z traceback.print_stack() 2022-11-23T03:12:18.7861983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7862220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7862452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7862684Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7862814Z File "", line 1, in 2022-11-23T03:12:18.7863006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7863151Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7863405Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7863721Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7864377Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7864499Z self.run() 2022-11-23T03:12:18.7864705Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7864851Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7865184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7865318Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7865678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7865804Z getattr(self, test_name)() 2022-11-23T03:12:18.7866161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7866267Z fn() 2022-11-23T03:12:18.7866636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7866759Z test(self, **param_kwargs) 2022-11-23T03:12:18.7867096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7867220Z return func(*args, **kwargs) 2022-11-23T03:12:18.7867478Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7867592Z self.run_subtests( 2022-11-23T03:12:18.7868099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7868257Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7868607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7868763Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7869106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7869222Z output = model(*input) 2022-11-23T03:12:18.7869535Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7869671Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7870033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7870201Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7870551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7870670Z _lazy_init(state, module) 2022-11-23T03:12:18.7870994Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7871209Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7871552Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7871674Z return func(*args, **kwargs) 2022-11-23T03:12:18.7872042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7872145Z p_assert( 2022-11-23T03:12:18.7872467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7872591Z traceback.print_stack() 2022-11-23T03:12:18.7872698Z File "", line 1, in 2022-11-23T03:12:18.7872899Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7873040Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7873303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7873452Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7873657Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7873761Z self.run() 2022-11-23T03:12:18.7873937Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7874078Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7874411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7874541Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7874888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7875189Z getattr(self, test_name)() 2022-11-23T03:12:18.7875548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7875656Z fn() 2022-11-23T03:12:18.7876003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7876130Z test(self, **param_kwargs) 2022-11-23T03:12:18.7876486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7876612Z return func(*args, **kwargs) 2022-11-23T03:12:18.7876868Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7876983Z self.run_subtests( 2022-11-23T03:12:18.7877337Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7877496Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7877842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7878002Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7878389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7878509Z output = model(*input) 2022-11-23T03:12:18.7878838Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7878979Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7879354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7879531Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7879879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7880000Z _lazy_init(state, module) 2022-11-23T03:12:18.7880404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7880556Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7880894Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7881019Z return func(*args, **kwargs) 2022-11-23T03:12:18.7881395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7881499Z p_assert( 2022-11-23T03:12:18.7881819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7881945Z traceback.print_stack() 2022-11-23T03:12:18.7882074Z File "", line 1, in 2022-11-23T03:12:18.7882282Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7882426Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7882682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7882835Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7883047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7883134Z self.run() 2022-11-23T03:12:18.7883336Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7883482Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7883820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7883955Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7884470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7884592Z getattr(self, test_name)() 2022-11-23T03:12:18.7884918Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7885021Z fn() 2022-11-23T03:12:18.7885553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7885679Z test(self, **param_kwargs) 2022-11-23T03:12:18.7886034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7886159Z return func(*args, **kwargs) 2022-11-23T03:12:18.7886414Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7886528Z self.run_subtests( 2022-11-23T03:12:18.7886861Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7887024Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7887446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7887611Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7887986Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7888106Z output = model(*input) 2022-11-23T03:12:18.7888428Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7888570Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7888925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7889100Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7889464Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7889586Z _lazy_init(state, module) 2022-11-23T03:12:18.7889989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7890139Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7890268Z File "", line 1, in 2022-11-23T03:12:18.7890606Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7890712Z return func(*args, **kwargs) 2022-11-23T03:12:18.7891087Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7891189Z p_assert( 2022-11-23T03:12:18.7891550Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in 
spawn_main 2022-11-23T03:12:18.7891689Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7892013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7892186Z traceback.print_stack() 2022-11-23T03:12:18.7892385Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7892514Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7892720Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7892822Z self.run() 2022-11-23T03:12:18.7893020Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7893162Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7893490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7893800Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7894143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7894266Z getattr(self, test_name)() 2022-11-23T03:12:18.7894625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7894728Z fn() 2022-11-23T03:12:18.7895088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7895208Z test(self, **param_kwargs) 2022-11-23T03:12:18.7895559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7895680Z return func(*args, **kwargs) 2022-11-23T03:12:18.7895916Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7896029Z self.run_subtests( 2022-11-23T03:12:18.7896379Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7896539Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7896898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7897055Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7897426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7897543Z output = model(*input) 2022-11-23T03:12:18.7898011Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7898148Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7898506Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7898675Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7899029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7899148Z _lazy_init(state, module) 2022-11-23T03:12:18.7899530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7899674Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7899985Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7900106Z return func(*args, **kwargs) 2022-11-23T03:12:18.7900654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7900756Z p_assert( 2022-11-23T03:12:18.7901089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7901215Z traceback.print_stack() 2022-11-23T03:12:18.7901452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7901684Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7901953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7902183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7902313Z File "", line 1, in 2022-11-23T03:12:18.7902519Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7902660Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7902858Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7903007Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7903217Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7903303Z self.run() 2022-11-23T03:12:18.7903503Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7903649Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7904375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7904514Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7904867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7904986Z getattr(self, test_name)() 2022-11-23T03:12:18.7905327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7905404Z fn() 2022-11-23T03:12:18.7905754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7905871Z test(self, **param_kwargs) 2022-11-23T03:12:18.7906211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7906328Z return func(*args, **kwargs) 2022-11-23T03:12:18.7906579Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7906689Z self.run_subtests( 2022-11-23T03:12:18.7907026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7907164Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7907514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7907659Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7908016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7908131Z output = model(*input) 2022-11-23T03:12:18.7908442Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7908581Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7909009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7909171Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7909525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7909644Z _lazy_init(state, module) 2022-11-23T03:12:18.7909983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7910120Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7910440Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7910558Z return func(*args, **kwargs) 2022-11-23T03:12:18.7910921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7911069Z p_assert( 2022-11-23T03:12:18.7911399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7911521Z traceback.print_stack() 2022-11-23T03:12:18.7911645Z File "", line 1, in 2022-11-23T03:12:18.7911848Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7911985Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7912180Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7912308Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7912697Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7912800Z self.run() 2022-11-23T03:12:18.7913002Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7913152Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7913496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7913628Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7913987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7914092Z getattr(self, test_name)() 2022-11-23T03:12:18.7914450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7914545Z fn() 2022-11-23T03:12:18.7914909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7915029Z test(self, **param_kwargs) 2022-11-23T03:12:18.7915541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7915664Z return func(*args, **kwargs) 2022-11-23T03:12:18.7915914Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7916007Z self.run_subtests( 2022-11-23T03:12:18.7916346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7916502Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7916849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7916994Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7917534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7917654Z output = model(*input) 2022-11-23T03:12:18.7917977Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7918103Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7918528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7918711Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7919078Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7919198Z _lazy_init(state, module) 2022-11-23T03:12:18.7919545Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7919686Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7920019Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7920124Z return func(*args, **kwargs) 2022-11-23T03:12:18.7920497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7920665Z p_assert( 2022-11-23T03:12:18.7921003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7921126Z traceback.print_stack() 2022-11-23T03:12:18.7921253Z File "", line 1, in 2022-11-23T03:12:18.7921458Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7921598Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7921781Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7921931Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7922140Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7922243Z self.run() 2022-11-23T03:12:18.7922444Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7922591Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7922932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7923046Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7923560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7923677Z getattr(self, test_name)() 2022-11-23T03:12:18.7924020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7924112Z fn() 2022-11-23T03:12:18.7924460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7924576Z test(self, **param_kwargs) 2022-11-23T03:12:18.7924917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7925025Z return func(*args, **kwargs) 2022-11-23T03:12:18.7925273Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7925381Z self.run_subtests( 2022-11-23T03:12:18.7925900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7926063Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7926420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7926570Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7926942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7927043Z output = model(*input) 2022-11-23T03:12:18.7927367Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7927510Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7927930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7928113Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7928477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7928753Z _lazy_init(state, module) 2022-11-23T03:12:18.7929089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7929209Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7929587Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7929708Z return func(*args, **kwargs) 2022-11-23T03:12:18.7930072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7930224Z p_assert( 2022-11-23T03:12:18.7930551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7930671Z traceback.print_stack() 2022-11-23T03:12:18.7930794Z File "", line 1, in 2022-11-23T03:12:18.7930978Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7931114Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7931308Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7931453Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7931657Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7931756Z self.run() 2022-11-23T03:12:18.7931952Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7932079Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7932407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7932536Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7932881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7933172Z getattr(self, test_name)() 2022-11-23T03:12:18.7933526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7933622Z fn() 2022-11-23T03:12:18.7933983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.7934087Z test(self, **param_kwargs) 2022-11-23T03:12:18.7934440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7934567Z return func(*args, **kwargs) 2022-11-23T03:12:18.7934824Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7934940Z self.run_subtests( 2022-11-23T03:12:18.7935291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7935452Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7935813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7936102Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7936459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7936571Z output = model(*input) 2022-11-23T03:12:18.7937060Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7937253Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7937637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7937810Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7938172Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7938292Z _lazy_init(state, module) 2022-11-23T03:12:18.7938622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7938764Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7939099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7939222Z return func(*args, **kwargs) 2022-11-23T03:12:18.7939672Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7939774Z p_assert( 2022-11-23T03:12:18.7940107Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7940215Z traceback.print_stack() 2022-11-23T03:12:18.7940451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7940843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7941066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7941286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.7941412Z File "", line 1, in 2022-11-23T03:12:18.7941611Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7941752Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7941932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7942249Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7942460Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7942563Z self.run() 2022-11-23T03:12:18.7942762Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7942904Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7943242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7943374Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7943715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7943838Z getattr(self, test_name)() 2022-11-23T03:12:18.7944440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7944536Z fn() 2022-11-23T03:12:18.7944896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7945178Z test(self, **param_kwargs) 2022-11-23T03:12:18.7945702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7945826Z return func(*args, **kwargs) 2022-11-23T03:12:18.7946066Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7946177Z self.run_subtests( 2022-11-23T03:12:18.7946526Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7946688Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7947134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7947294Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7947668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7947785Z output = model(*input) 2022-11-23T03:12:18.7948090Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7948230Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7948603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7948777Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7949138Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7949483Z _lazy_init(state, module) 2022-11-23T03:12:18.7949823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7949962Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7950268Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7950387Z return func(*args, **kwargs) 2022-11-23T03:12:18.7950749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7950846Z p_assert( 2022-11-23T03:12:18.7951169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7951291Z traceback.print_stack() 2022-11-23T03:12:18.7951414Z File "", line 1, in 2022-11-23T03:12:18.7951613Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7951738Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7951934Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7952078Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7952282Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7952381Z self.run() 2022-11-23T03:12:18.7952577Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7952719Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7953206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7953339Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7953698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7953821Z getattr(self, test_name)() 2022-11-23T03:12:18.7954182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7954278Z fn() 2022-11-23T03:12:18.7954641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7954762Z test(self, **param_kwargs) 2022-11-23T03:12:18.7955096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7955219Z return func(*args, **kwargs) 2022-11-23T03:12:18.7955471Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7955584Z self.run_subtests( 2022-11-23T03:12:18.7956091Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7956247Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7956829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7956988Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7957346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7957464Z output = model(*input) 2022-11-23T03:12:18.7957787Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7957926Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7958294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7958467Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7958830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7958998Z _lazy_init(state, module) 2022-11-23T03:12:18.7959331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7959475Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7959966Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7960085Z return func(*args, **kwargs) 2022-11-23T03:12:18.7960449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7960547Z p_assert( 2022-11-23T03:12:18.7961057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7961182Z traceback.print_stack() 2022-11-23T03:12:18.7961293Z File "", line 1, in 2022-11-23T03:12:18.7961500Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7961644Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7961848Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7961997Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7962204Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7962305Z self.run() 2022-11-23T03:12:18.7962488Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7962631Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7962966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7963099Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7963456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7963578Z getattr(self, test_name)() 2022-11-23T03:12:18.7963708Z File "", line 1, in 2022-11-23T03:12:18.7964396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7964478Z fn() 2022-11-23T03:12:18.7964838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7964961Z test(self, **param_kwargs) 2022-11-23T03:12:18.7965169Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7965312Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7965666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7965790Z return func(*args, **kwargs) 
2022-11-23T03:12:18.7965992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7966125Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7966428Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7966548Z self.run_subtests( 2022-11-23T03:12:18.7966758Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7966860Z self.run() 2022-11-23T03:12:18.7967211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7967373Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7967574Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7967700Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7968058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7968209Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7968595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7968728Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7969100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7969218Z output = model(*input) 2022-11-23T03:12:18.7969577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7969682Z getattr(self, test_name)() 2022-11-23T03:12:18.7970002Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7970144Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7970499Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7970596Z fn() 2022-11-23T03:12:18.7971125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7971294Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7971644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.7971745Z test(self, **param_kwargs) 2022-11-23T03:12:18.7972096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7972213Z _lazy_init(state, module) 2022-11-23T03:12:18.7972553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7972673Z return func(*args, **kwargs) 2022-11-23T03:12:18.7973007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7973149Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7973399Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7973491Z self.run_subtests( 2022-11-23T03:12:18.7974001Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7974125Z return func(*args, **kwargs) 2022-11-23T03:12:18.7974472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7974633Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7975005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7975106Z 
p_assert( 2022-11-23T03:12:18.7975467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7975607Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7975988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7976120Z traceback.print_stack() 2022-11-23T03:12:18.7976494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7976612Z output = model(*input) 2022-11-23T03:12:18.7976932Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7977071Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7977429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7977604Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7977966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7978147Z _lazy_init(state, module) 2022-11-23T03:12:18.7978498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7978640Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7978973Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7979094Z return func(*args, **kwargs) 2022-11-23T03:12:18.7979469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7979554Z p_assert( 2022-11-23T03:12:18.7979889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.7980013Z traceback.print_stack() 2022-11-23T03:12:18.7980250Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7980490Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7980718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7980949Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.7981059Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7981268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7981407Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7981610Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7981759Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7981968Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7982070Z self.run() 2022-11-23T03:12:18.7982271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7982404Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7982746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7982878Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7983239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7983359Z getattr(self, test_name)() 2022-11-23T03:12:18.7983714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7983810Z fn() 2022-11-23T03:12:18.7984420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.7984527Z test(self, **param_kwargs) 2022-11-23T03:12:18.7985031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7985154Z return func(*args, **kwargs) 2022-11-23T03:12:18.7985473Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7985589Z self.run_subtests( 2022-11-23T03:12:18.7985930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7986085Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7986434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7986565Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7986924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7987039Z output = model(*input) 2022-11-23T03:12:18.7987574Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7987802Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7988180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7988354Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7988718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7988820Z _lazy_init(state, module) 2022-11-23T03:12:18.7989168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7989310Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7989645Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7989768Z return func(*args, **kwargs) 2022-11-23T03:12:18.7990152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7990253Z p_assert( 2022-11-23T03:12:18.7990584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.7990692Z traceback.print_stack() 2022-11-23T03:12:18.7990820Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.7991027Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.7991167Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.7991367Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.7991520Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.7991882Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.7991965Z self.run() 2022-11-23T03:12:18.7992160Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.7992305Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.7992632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.7992759Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.7993103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.7993221Z getattr(self, test_name)() 2022-11-23T03:12:18.7993562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.7993639Z fn() 2022-11-23T03:12:18.7994155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.7994280Z test(self, **param_kwargs) 2022-11-23T03:12:18.7994631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.7994806Z return func(*args, **kwargs) 2022-11-23T03:12:18.7995069Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.7995181Z self.run_subtests( 2022-11-23T03:12:18.7995531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.7995677Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.7996040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.7996191Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.7996562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.7996679Z output = model(*input) 2022-11-23T03:12:18.7997054Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.7997198Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.7997573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.7997732Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.7998094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.7998377Z _lazy_init(state, module) 2022-11-23T03:12:18.7998715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.7998851Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.7999171Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.7999292Z return func(*args, **kwargs) 2022-11-23T03:12:18.7999664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.7999746Z p_assert( 2022-11-23T03:12:18.8000071Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8000191Z traceback.print_stack() 2022-11-23T03:12:18.8000315Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.8000517Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8000652Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8001015Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8001166Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8001360Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8001463Z self.run() 2022-11-23T03:12:18.8001670Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8001817Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8002151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8002284Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8002642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8002746Z getattr(self, test_name)() 2022-11-23T03:12:18.8003101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8003198Z fn() 2022-11-23T03:12:18.8003557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.8003679Z test(self, **param_kwargs) 2022-11-23T03:12:18.8004178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8004347Z return func(*args, **kwargs) 2022-11-23T03:12:18.8004604Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8004695Z self.run_subtests( 2022-11-23T03:12:18.8005034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8005189Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8005536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8005681Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8006038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8006153Z output = model(*input) 2022-11-23T03:12:18.8006520Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8006639Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8006998Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8007167Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8007519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8007635Z _lazy_init(state, module) 2022-11-23T03:12:18.8007970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8008108Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8008429Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8008536Z return func(*args, **kwargs) 2022-11-23T03:12:18.8008903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8009004Z p_assert( 2022-11-23T03:12:18.8009327Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8009449Z traceback.print_stack() 2022-11-23T03:12:18.8009573Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.8009774Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8009910Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8010089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8010234Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8010435Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8010535Z self.run() 2022-11-23T03:12:18.8010736Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8010875Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8011200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8011327Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8011656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8011774Z getattr(self, test_name)() 2022-11-23T03:12:18.8012118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8012214Z fn() 2022-11-23T03:12:18.8012563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test
2022-11-23T03:12:18.8012681Z test(self, **param_kwargs) 2022-11-23T03:12:18.8013211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8013387Z return func(*args, **kwargs) 2022-11-23T03:12:18.8013630Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8013744Z self.run_subtests( 2022-11-23T03:12:18.8014094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8014255Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8014616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8014768Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8015141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8015260Z output = model(*input) 2022-11-23T03:12:18.8015779Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8015917Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8016278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8016448Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8016800Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8016917Z _lazy_init(state, module) 2022-11-23T03:12:18.8017254Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8017391Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8017876Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8018003Z return func(*args, **kwargs) 2022-11-23T03:12:18.8018380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8018482Z p_assert( 2022-11-23T03:12:18.8018817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8018943Z traceback.print_stack() 2022-11-23T03:12:18.8019182Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8019420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8019634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8019869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8020101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8020340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8020568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8020797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8021026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8021255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8021460Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8021689Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8021919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8022149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8022422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8022652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8022875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8023098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8023301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8023525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8023748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8024539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8024762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8025065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8025290Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8025517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8025737Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8025941Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8026166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8026389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8026607Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8026835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8027057Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8027282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8027505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8027708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8027929Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8028156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8028381Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8028605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8028834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8029217Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8029430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8029670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8029889Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8030104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8030318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8030532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8030746Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8031026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8031251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8031465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8031664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8031879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8032092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8032307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8032520Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8032736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8033001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8033214Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8033587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8033810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8034031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8034251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8034473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8034694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8034918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8035141Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8035363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8035569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8035788Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8036008Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8036389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8036780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8037003Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8037233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8037455Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8037660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8037882Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8038104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8038325Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8038547Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8038768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8038988Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8039260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8039473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8039695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8039918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8040141Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8040364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8040584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8040806Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8041187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8041453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8041650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8041865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8041973Z dist init r=1, world=4 2022-11-23T03:12:18.8042295Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8042602Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8042901Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8043200Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8043491Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8043781Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8044070Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8044344Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8044636Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8044928Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.8045218Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8045508Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8045615Z dist init r=0, world=4 2022-11-23T03:12:18.8046109Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8046469Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8046790Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8047095Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8047397Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8047680Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8047981Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8048346Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.8048645Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8048942Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8049241Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8049538Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8049653Z dist init r=3, world=4 2022-11-23T03:12:18.8049978Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8050288Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8050594Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8050897Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8051182Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8051490Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.8051789Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8052248Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8052537Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8052826Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8053161Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8053463Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8053569Z dist init r=2, world=4 2022-11-23T03:12:18.8053879Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8054186Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8054464Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8054761Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.8055104Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8055396Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8055870Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8056170Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8056469Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8056774Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8057073Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8057372Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8057473Z ok (7.224s) 2022-11-23T03:12:18.8057816Z test_nested_always_wrap_model_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24065 2022-11-23T03:12:18.8058035Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24066 2022-11-23T03:12:18.8058252Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24067 2022-11-23T03:12:18.8058468Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24068 2022-11-23T03:12:18.8058862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8059036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8059413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8059602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8059966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8060123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8060502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8060745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8061121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8061294Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8061664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8061852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8062213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.8062368Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8062743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8062986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8063231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.8063473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.8063712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.8064144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.8064890Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8065286Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8065757Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8066152Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.8066379Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.8066605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.8066829Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.8067052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.8067284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8067514Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8067742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8067960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8069116Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8069228Z warnings.warn( 2022-11-23T03:12:18.8070273Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.8070392Z warnings.warn( 2022-11-23T03:12:18.8071578Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8071687Z warnings.warn( 2022-11-23T03:12:18.8072696Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.8072867Z warnings.warn( 2022-11-23T03:12:18.8072997Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.8073211Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8073353Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8073557Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8073689Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8073904Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8074012Z self.run() 2022-11-23T03:12:18.8074216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8074369Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8074723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8074863Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8075231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8075338Z getattr(self, test_name)() 2022-11-23T03:12:18.8075854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8075955Z fn() 2022-11-23T03:12:18.8076312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8076435Z test(self, **param_kwargs) 2022-11-23T03:12:18.8076780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8076905Z return func(*args, **kwargs) 2022-11-23T03:12:18.8077344Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8077441Z self.run_subtests( 2022-11-23T03:12:18.8077796Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8077961Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8078326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8078485Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8078862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8078984Z output = model(*input) 2022-11-23T03:12:18.8079316Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8079440Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8079885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8080072Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8080436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8080556Z _lazy_init(state, module) 2022-11-23T03:12:18.8080903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8081045Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8081383Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8081488Z return func(*args, **kwargs) 2022-11-23T03:12:18.8082195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8082347Z p_assert( 2022-11-23T03:12:18.8082690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8082816Z traceback.print_stack() 2022-11-23T03:12:18.8082949Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.8083165Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8083288Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8083491Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8083644Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8083859Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8083966Z self.run() 2022-11-23T03:12:18.8084170Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8084318Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8084668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8084783Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8085298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8085422Z getattr(self, test_name)() 2022-11-23T03:12:18.8085773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8085871Z fn() 2022-11-23T03:12:18.8086221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8086346Z test(self, **param_kwargs) 2022-11-23T03:12:18.8086689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8086793Z return func(*args, **kwargs) 2022-11-23T03:12:18.8087051Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8087163Z self.run_subtests( 2022-11-23T03:12:18.8087570Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8087908Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8088273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8088427Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8088801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8088903Z output = model(*input) 2022-11-23T03:12:18.8089236Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8089383Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8089821Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8090006Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8090374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8090499Z _lazy_init(state, module) 2022-11-23T03:12:18.8090848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8090973Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8091315Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8091442Z return func(*args, **kwargs) 2022-11-23T03:12:18.8091819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8091976Z p_assert( 2022-11-23T03:12:18.8092314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8092441Z traceback.print_stack() 2022-11-23T03:12:18.8092567Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.8092759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8092902Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8093107Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8093262Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8093478Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8093587Z self.run() 2022-11-23T03:12:18.8093952Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8094075Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8094591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8094730Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8095091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8095216Z getattr(self, test_name)() 2022-11-23T03:12:18.8095576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8095677Z fn() 2022-11-23T03:12:18.8096045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8096151Z test(self, **param_kwargs) 2022-11-23T03:12:18.8096509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8096635Z return func(*args, **kwargs) 2022-11-23T03:12:18.8096901Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8097018Z self.run_subtests( 2022-11-23T03:12:18.8097374Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8097539Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8097904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8098038Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8098415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8098538Z output = model(*input) 2022-11-23T03:12:18.8098870Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8099017Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8099446Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8099632Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8100000Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8100104Z _lazy_init(state, module) 2022-11-23T03:12:18.8100457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8100601Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8101104Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8101405Z return func(*args, **kwargs) 2022-11-23T03:12:18.8101786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8101999Z p_assert( 2022-11-23T03:12:18.8102343Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8102452Z traceback.print_stack() 2022-11-23T03:12:18.8102588Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.8102804Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8102951Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8103156Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8103310Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8103525Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8103634Z self.run() 2022-11-23T03:12:18.8103818Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8104182Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8104537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8104673Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8105037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8105165Z getattr(self, test_name)() 2022-11-23T03:12:18.8105524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8105605Z fn() 2022-11-23T03:12:18.8105970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8106096Z test(self, **param_kwargs) 2022-11-23T03:12:18.8106455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8106586Z return func(*args, **kwargs) 2022-11-23T03:12:18.8106848Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8106963Z self.run_subtests( 2022-11-23T03:12:18.8107319Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8107463Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8107828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8107987Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8108362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8108484Z output = model(*input) 2022-11-23T03:12:18.8109139Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8109289Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8109737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8109903Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8110275Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8110401Z _lazy_init(state, module) 2022-11-23T03:12:18.8110748Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8110888Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8111221Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8111344Z return func(*args, **kwargs) 2022-11-23T03:12:18.8111718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8111884Z p_assert( 2022-11-23T03:12:18.8112222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8112348Z traceback.print_stack() 2022-11-23T03:12:18.8112584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8112821Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8113050Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8113280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8113408Z File "", line 1, in 2022-11-23T03:12:18.8113599Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8113739Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8113948Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8114096Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8114304Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8114406Z self.run() 2022-11-23T03:12:18.8114607Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8114751Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8115077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8115209Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8115572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8115695Z getattr(self, test_name)() 2022-11-23T03:12:18.8116208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8116307Z fn() 2022-11-23T03:12:18.8116663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8116783Z test(self, **param_kwargs) 2022-11-23T03:12:18.8117108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8117228Z return func(*args, **kwargs) 2022-11-23T03:12:18.8117475Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8117585Z self.run_subtests( 2022-11-23T03:12:18.8118102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8118265Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8118627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8118830Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8119199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8119318Z output = model(*input) 2022-11-23T03:12:18.8119641Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8119782Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8120155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8120331Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8120694Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8120813Z _lazy_init(state, module) 2022-11-23T03:12:18.8121198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8121341Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8121676Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8121799Z return func(*args, **kwargs) 2022-11-23T03:12:18.8122174Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8122275Z p_assert( 2022-11-23T03:12:18.8122610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8122736Z traceback.print_stack() 2022-11-23T03:12:18.8122847Z File "", line 1, in 2022-11-23T03:12:18.8123055Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8123198Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8123406Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8123557Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8123769Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8123875Z self.run() 2022-11-23T03:12:18.8124058Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8124202Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8124696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8124826Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8125171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8125289Z getattr(self, test_name)() 2022-11-23T03:12:18.8125636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8125738Z fn() 2022-11-23T03:12:18.8126074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8126373Z test(self, **param_kwargs) 2022-11-23T03:12:18.8126727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8126849Z return func(*args, **kwargs) 2022-11-23T03:12:18.8127103Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8127217Z self.run_subtests( 2022-11-23T03:12:18.8127565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8127728Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8128070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8128271Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8128650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8128770Z output = model(*input) 2022-11-23T03:12:18.8129092Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8129232Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8129801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8129971Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8130304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8130471Z _lazy_init(state, module) 2022-11-23T03:12:18.8130814Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8130954Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8131277Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8131397Z return func(*args, **kwargs) 2022-11-23T03:12:18.8131755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8131854Z p_assert( 2022-11-23T03:12:18.8132158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8132278Z traceback.print_stack() 2022-11-23T03:12:18.8132401Z File "", line 1, in 2022-11-23T03:12:18.8132600Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8132741Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8132937Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8133082Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8133285Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8133367Z self.run() 2022-11-23T03:12:18.8133561Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8133870Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8134211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8134343Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8134701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8134823Z getattr(self, test_name)() 2022-11-23T03:12:18.8135162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8135265Z fn() 2022-11-23T03:12:18.8135628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8135750Z test(self, **param_kwargs) 2022-11-23T03:12:18.8136105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8136228Z return func(*args, **kwargs) 2022-11-23T03:12:18.8136482Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8136593Z self.run_subtests( 2022-11-23T03:12:18.8137250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8137410Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8137824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8137983Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8138357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8138476Z output = model(*input) 2022-11-23T03:12:18.8138799Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8138938Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8139294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8139469Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8139830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8139999Z _lazy_init(state, module) 2022-11-23T03:12:18.8140351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8140493Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8140828Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8140951Z return func(*args, **kwargs) 2022-11-23T03:12:18.8141472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8141571Z p_assert( 2022-11-23T03:12:18.8141892Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8142013Z traceback.print_stack() 2022-11-23T03:12:18.8142137Z File "", line 1, in 2022-11-23T03:12:18.8142338Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8142478Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8142712Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8142866Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8143070Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8143170Z self.run() 2022-11-23T03:12:18.8143363Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8143504Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8144219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8144360Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8144709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8144830Z getattr(self, test_name)() 2022-11-23T03:12:18.8145190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8145287Z fn() 2022-11-23T03:12:18.8145647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8145768Z test(self, **param_kwargs) 2022-11-23T03:12:18.8146121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8146244Z return func(*args, **kwargs) 2022-11-23T03:12:18.8146480Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8146596Z self.run_subtests( 2022-11-23T03:12:18.8146944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8147104Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8147544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8147711Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8148087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8148205Z output = model(*input) 2022-11-23T03:12:18.8148513Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8148653Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8149025Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8149200Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8149562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8149745Z _lazy_init(state, module) 2022-11-23T03:12:18.8150098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8150242Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8150577Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8150683Z return func(*args, **kwargs) 2022-11-23T03:12:18.8151056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8151159Z p_assert( 2022-11-23T03:12:18.8151643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8151765Z traceback.print_stack() 2022-11-23T03:12:18.8151993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8152220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8152434Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8152659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8152783Z File "", line 1, in 2022-11-23T03:12:18.8152985Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8153121Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8153314Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8153460Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8153664Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8153746Z self.run() 2022-11-23T03:12:18.8153939Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8154084Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8154412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8154538Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8154883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8155001Z getattr(self, test_name)() 2022-11-23T03:12:18.8155344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8155420Z fn() 2022-11-23T03:12:18.8155768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8155886Z test(self, **param_kwargs) 2022-11-23T03:12:18.8156226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8156347Z return func(*args, **kwargs) 2022-11-23T03:12:18.8156639Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8156756Z self.run_subtests( 2022-11-23T03:12:18.8157271Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8157416Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8157777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8157929Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8158304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8158423Z output = model(*input) 2022-11-23T03:12:18.8158744Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8158933Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8159309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8159467Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8159832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8159953Z _lazy_init(state, module) 2022-11-23T03:12:18.8160462Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8160600Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8160924Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8161043Z return func(*args, **kwargs) 2022-11-23T03:12:18.8161591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8161682Z p_assert( 2022-11-23T03:12:18.8162019Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8162144Z traceback.print_stack() 2022-11-23T03:12:18.8162273Z File "", line 1, in 2022-11-23T03:12:18.8162482Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8162622Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8162823Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8162955Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8163165Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8163269Z self.run() 2022-11-23T03:12:18.8163472Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8163620Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8163958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8164091Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8164449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8164554Z getattr(self, test_name)() 2022-11-23T03:12:18.8165064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8165157Z fn() 2022-11-23T03:12:18.8165507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8165794Z test(self, **param_kwargs) 2022-11-23T03:12:18.8166148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8166275Z return func(*args, **kwargs) 2022-11-23T03:12:18.8166573Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8166675Z self.run_subtests( 2022-11-23T03:12:18.8167028Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8167190Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8167548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8167700Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8168072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8168191Z output = model(*input) 2022-11-23T03:12:18.8168674Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8168844Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8169207Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8169375Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8169725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8169841Z _lazy_init(state, module) 2022-11-23T03:12:18.8170175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8170311Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8170634Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8170736Z return func(*args, **kwargs) 2022-11-23T03:12:18.8171104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8171203Z p_assert( 2022-11-23T03:12:18.8171524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8171645Z traceback.print_stack() 2022-11-23T03:12:18.8171769Z File "", line 1, in 2022-11-23T03:12:18.8171971Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8172106Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8172282Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8172427Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8172632Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8172736Z self.run() 2022-11-23T03:12:18.8172933Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8173078Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8173407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8173518Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8173866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8173983Z getattr(self, test_name)() 2022-11-23T03:12:18.8174325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8174421Z fn() 2022-11-23T03:12:18.8174767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8174885Z test(self, **param_kwargs) 2022-11-23T03:12:18.8175224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8175506Z return func(*args, **kwargs) 2022-11-23T03:12:18.8175810Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8175932Z self.run_subtests( 2022-11-23T03:12:18.8176286Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8176447Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8176806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8176960Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8177331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8177432Z output = model(*input) 2022-11-23T03:12:18.8177755Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8177964Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8178337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8178511Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8178875Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8178996Z _lazy_init(state, module) 2022-11-23T03:12:18.8179341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8179467Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8179800Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8179923Z return func(*args, **kwargs) 2022-11-23T03:12:18.8180306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8180408Z p_assert( 2022-11-23T03:12:18.8180741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8180866Z traceback.print_stack() 2022-11-23T03:12:18.8180995Z File "", line 1, in 2022-11-23T03:12:18.8181186Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8181326Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8181527Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8181677Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8181887Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8181990Z self.run() 2022-11-23T03:12:18.8182190Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8182667Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8182989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8183121Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8183476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8183599Z getattr(self, test_name)() 2022-11-23T03:12:18.8184158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8184265Z fn() 2022-11-23T03:12:18.8184634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8184739Z test(self, **param_kwargs) 2022-11-23T03:12:18.8185094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8185289Z return func(*args, **kwargs) 2022-11-23T03:12:18.8185716Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8185999Z self.run_subtests( 2022-11-23T03:12:18.8186353Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8186513Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8186876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8187028Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8187431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8187550Z output = model(*input) 2022-11-23T03:12:18.8187943Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8188088Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8188465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8188640Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8189003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8189124Z _lazy_init(state, module) 2022-11-23T03:12:18.8189455Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8189596Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8189930Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8190054Z return func(*args, **kwargs) 2022-11-23T03:12:18.8190438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8190540Z p_assert( 2022-11-23T03:12:18.8190877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8191003Z traceback.print_stack() 2022-11-23T03:12:18.8191221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8191456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8191687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8191917Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8192048Z File "", line 1, in 2022-11-23T03:12:18.8192414Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8192558Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8192738Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8192884Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8193090Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8193191Z self.run() 2022-11-23T03:12:18.8193386Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8193525Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8193851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8193979Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8194312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8194431Z getattr(self, test_name)() 2022-11-23T03:12:18.8195007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8195110Z fn() 2022-11-23T03:12:18.8195473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8195596Z test(self, **param_kwargs) 2022-11-23T03:12:18.8195948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8196074Z return func(*args, **kwargs) 2022-11-23T03:12:18.8196314Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8196428Z self.run_subtests( 2022-11-23T03:12:18.8196776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8196938Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8197352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8197504Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8197876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8197997Z output = model(*input) 2022-11-23T03:12:18.8198305Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8198445Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8198978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8199149Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8199501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8199620Z _lazy_init(state, module) 2022-11-23T03:12:18.8199957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8200098Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8200405Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8200525Z return func(*args, **kwargs) 2022-11-23T03:12:18.8200888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8200986Z p_assert( 2022-11-23T03:12:18.8201308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8201608Z traceback.print_stack() 2022-11-23T03:12:18.8201737Z File "", line 1, in 2022-11-23T03:12:18.8201944Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8202078Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8202280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8202431Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8202641Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8202743Z self.run() 2022-11-23T03:12:18.8202944Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8203087Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8203407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8203540Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8203901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8204026Z getattr(self, test_name)() 2022-11-23T03:12:18.8204431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8204698Z fn() 2022-11-23T03:12:18.8205054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8205172Z test(self, **param_kwargs) 2022-11-23T03:12:18.8205500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8205620Z return func(*args, **kwargs) 2022-11-23T03:12:18.8205866Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8205976Z self.run_subtests( 2022-11-23T03:12:18.8206313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8206469Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8206868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8207016Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8207361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8207480Z output = model(*input) 2022-11-23T03:12:18.8207792Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8207928Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8208285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8208454Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8208803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8208926Z _lazy_init(state, module) 2022-11-23T03:12:18.8209247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8209387Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8209710Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8209829Z return func(*args, **kwargs) 2022-11-23T03:12:18.8210190Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8210289Z p_assert( 2022-11-23T03:12:18.8210612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8210734Z traceback.print_stack() 2022-11-23T03:12:18.8210841Z File "", line 1, in 2022-11-23T03:12:18.8211043Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8211186Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8211381Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8211525Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8211730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8211830Z self.run() 2022-11-23T03:12:18.8212029Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8212152Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8212476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8212603Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8212947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8213072Z getattr(self, test_name)() 2022-11-23T03:12:18.8213642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8213747Z fn() 2022-11-23T03:12:18.8214094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8214218Z test(self, **param_kwargs) 2022-11-23T03:12:18.8214570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8214693Z return func(*args, **kwargs) 2022-11-23T03:12:18.8214946Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8215058Z self.run_subtests( 2022-11-23T03:12:18.8215411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8215622Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8215970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8216125Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8216657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8216772Z output = model(*input) 2022-11-23T03:12:18.8217084Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8217219Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8217576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8217744Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8218095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8218201Z _lazy_init(state, module) 2022-11-23T03:12:18.8218720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8218864Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8219197Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8219320Z return func(*args, **kwargs) 2022-11-23T03:12:18.8219694Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8219795Z p_assert( 2022-11-23T03:12:18.8220110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8220236Z traceback.print_stack() 2022-11-23T03:12:18.8220365Z File "", line 1, in 2022-11-23T03:12:18.8220577Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8220721Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8220922Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8221071Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8221282Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8221367Z self.run() 2022-11-23T03:12:18.8221568Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8222044Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8222385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8222519Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8222875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8223001Z getattr(self, test_name)() 2022-11-23T03:12:18.8223402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8223490Z fn() 2022-11-23T03:12:18.8224052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8224194Z test(self, **param_kwargs) 2022-11-23T03:12:18.8224554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8224678Z return func(*args, **kwargs) 2022-11-23T03:12:18.8225087Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8225198Z self.run_subtests( 2022-11-23T03:12:18.8225539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8225783Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8226140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8226289Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8226830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8226945Z output = model(*input) 2022-11-23T03:12:18.8227268Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8227408Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8227782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8227939Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8228302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8228432Z _lazy_init(state, module) 2022-11-23T03:12:18.8228780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8228921Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8229256Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8229378Z return func(*args, **kwargs) 2022-11-23T03:12:18.8230065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8230149Z p_assert( 2022-11-23T03:12:18.8230474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8230594Z traceback.print_stack() 2022-11-23T03:12:18.8230822Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8231056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8231280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8231502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8231626Z File "", line 1, in 2022-11-23T03:12:18.8231812Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8231948Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8232145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8232289Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8232492Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8232592Z self.run() 2022-11-23T03:12:18.8232793Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8233105Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8233447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8233574Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8233921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8234038Z getattr(self, test_name)() 2022-11-23T03:12:18.8234564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8234661Z fn() 2022-11-23T03:12:18.8235021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8235127Z test(self, **param_kwargs) 2022-11-23T03:12:18.8235484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8235661Z return func(*args, **kwargs) 2022-11-23T03:12:18.8235915Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8236030Z self.run_subtests( 2022-11-23T03:12:18.8236381Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8236546Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8236905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8237196Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8237742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8237859Z output = model(*input) 2022-11-23T03:12:18.8238191Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8238330Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8238701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8238877Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8239240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8239343Z _lazy_init(state, module) 2022-11-23T03:12:18.8239689Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8239830Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8240165Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8240290Z return func(*args, **kwargs) 2022-11-23T03:12:18.8240668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8240769Z p_assert( 2022-11-23T03:12:18.8241102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8241209Z traceback.print_stack() 2022-11-23T03:12:18.8241336Z File "", line 1, in 2022-11-23T03:12:18.8241543Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8241841Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8242037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8242182Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8242385Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8242663Z self.run() 2022-11-23T03:12:18.8242855Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8243050Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8243395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8243527Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8243887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8244010Z getattr(self, test_name)() 2022-11-23T03:12:18.8244365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8244445Z fn() 2022-11-23T03:12:18.8244806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8244926Z test(self, **param_kwargs) 2022-11-23T03:12:18.8245278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8245627Z return func(*args, **kwargs) 2022-11-23T03:12:18.8245874Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8245983Z self.run_subtests( 2022-11-23T03:12:18.8246324Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8246643Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8247004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8247155Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8247528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8247646Z output = model(*input) 2022-11-23T03:12:18.8247973Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8248113Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8248485Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8248643Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8249010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8249130Z _lazy_init(state, module) 2022-11-23T03:12:18.8249483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8249626Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8249961Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8250090Z return func(*args, **kwargs) 2022-11-23T03:12:18.8250621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8250704Z p_assert( 2022-11-23T03:12:18.8251028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8251149Z traceback.print_stack() 2022-11-23T03:12:18.8251272Z File "", line 1, in 2022-11-23T03:12:18.8251399Z File "", line 1, in 2022-11-23T03:12:18.8251601Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8251737Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8251932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8252060Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8252260Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8252400Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8252647Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8252754Z self.run() 2022-11-23T03:12:18.8252947Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8253091Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8253267Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8253583Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8253795Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8253898Z self.run() 2022-11-23T03:12:18.8254237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8254370Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8254615Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8254762Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8255105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8255228Z getattr(self, test_name)() 2022-11-23T03:12:18.8255565Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8255697Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8256055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8256153Z fn() 2022-11-23T03:12:18.8256669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8256786Z getattr(self, test_name)() 2022-11-23T03:12:18.8257120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8257246Z test(self, **param_kwargs) 2022-11-23T03:12:18.8257768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8257864Z fn() 2022-11-23T03:12:18.8258218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8258342Z return func(*args, **kwargs) 2022-11-23T03:12:18.8258698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8258818Z test(self, **param_kwargs) 2022-11-23T03:12:18.8259055Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8259167Z self.run_subtests( 2022-11-23T03:12:18.8259524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8259654Z return func(*args, **kwargs) 2022-11-23T03:12:18.8260001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8260162Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8260416Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8260528Z self.run_subtests( 2022-11-23T03:12:18.8261025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8261170Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8261508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8261662Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8262259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8262385Z output = model(*input) 2022-11-23T03:12:18.8262746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8262896Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8263200Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8263340Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8263712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8263830Z output = model(*input) 2022-11-23T03:12:18.8264424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8264689Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8265017Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8265157Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8265502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, 
in _root_pre_forward 2022-11-23T03:12:18.8265623Z _lazy_init(state, module) 2022-11-23T03:12:18.8265993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8266168Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8266515Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8266657Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8267016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8267143Z _lazy_init(state, module) 2022-11-23T03:12:18.8267461Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8267586Z return func(*args, **kwargs) 2022-11-23T03:12:18.8267931Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8268070Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8268445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8268547Z p_assert( 2022-11-23T03:12:18.8268881Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8269003Z return func(*args, **kwargs) 2022-11-23T03:12:18.8269318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8269448Z traceback.print_stack() 2022-11-23T03:12:18.8269826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8269926Z p_assert( 2022-11-23T03:12:18.8270255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.8270379Z traceback.print_stack() 2022-11-23T03:12:18.8270615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8270847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8271214Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8271442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8271567Z File "", line 1, in 2022-11-23T03:12:18.8271846Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8271995Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8272190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8272334Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8272539Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8272621Z self.run() 2022-11-23T03:12:18.8272814Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8272953Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8273286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8273414Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8273763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8273929Z getattr(self, test_name)() 2022-11-23T03:12:18.8274448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8274547Z fn() 2022-11-23T03:12:18.8274910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8275032Z test(self, **param_kwargs) 2022-11-23T03:12:18.8275385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8275508Z return func(*args, **kwargs) 2022-11-23T03:12:18.8275761Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8275873Z self.run_subtests( 2022-11-23T03:12:18.8276209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8276371Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8276732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8276886Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8277259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8277377Z output = model(*input) 2022-11-23T03:12:18.8277702Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8277841Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8278349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8278699Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8279072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8279193Z _lazy_init(state, module) 2022-11-23T03:12:18.8279541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8279683Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8280017Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8280141Z return func(*args, **kwargs) 2022-11-23T03:12:18.8280497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8280597Z p_assert( 2022-11-23T03:12:18.8280930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8281055Z traceback.print_stack() 2022-11-23T03:12:18.8281184Z File "", line 1, in 2022-11-23T03:12:18.8281441Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8281590Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8281791Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8281922Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8282133Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8282236Z self.run() 2022-11-23T03:12:18.8282437Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8282581Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8282922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8283056Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8283413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8283569Z getattr(self, test_name)() 2022-11-23T03:12:18.8283927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8284027Z fn() 2022-11-23T03:12:18.8284388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8284509Z test(self, **param_kwargs) 2022-11-23T03:12:18.8284861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8284985Z return func(*args, **kwargs) 2022-11-23T03:12:18.8285221Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8285332Z self.run_subtests( 2022-11-23T03:12:18.8285681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8286007Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8286358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8286506Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8286867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8286981Z output = model(*input) 2022-11-23T03:12:18.8287328Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8287468Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8287832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8288002Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8288548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8288670Z _lazy_init(state, module) 2022-11-23T03:12:18.8289018Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8289161Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8289497Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8289602Z return func(*args, **kwargs) 2022-11-23T03:12:18.8289978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8290079Z p_assert( 2022-11-23T03:12:18.8290412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8290536Z traceback.print_stack() 2022-11-23T03:12:18.8290667Z File "", line 1, in 2022-11-23T03:12:18.8290942Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8291074Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8291274Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8291424Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8291634Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8291737Z self.run() 2022-11-23T03:12:18.8291936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8292083Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8292423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8292538Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8293045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8293213Z getattr(self, test_name)() 2022-11-23T03:12:18.8293557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8293653Z fn() 2022-11-23T03:12:18.8294003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8294121Z test(self, **param_kwargs) 2022-11-23T03:12:18.8294462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8294564Z return func(*args, **kwargs) 2022-11-23T03:12:18.8294810Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8294920Z self.run_subtests( 2022-11-23T03:12:18.8295439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8295610Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8295970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8296121Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8296492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8296594Z output = model(*input) 2022-11-23T03:12:18.8296922Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8297062Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8297436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8297610Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8297982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8298103Z _lazy_init(state, module) 2022-11-23T03:12:18.8298451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8298575Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8298910Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8299034Z return func(*args, **kwargs) 2022-11-23T03:12:18.8299567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8299667Z p_assert( 2022-11-23T03:12:18.8299989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8300108Z traceback.print_stack() 2022-11-23T03:12:18.8300237Z File "", line 1, in 2022-11-23T03:12:18.8300464Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8300607Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8300801Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8300946Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8301149Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8301249Z self.run() 2022-11-23T03:12:18.8301443Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8301568Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8301896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8302199Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8302559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8302733Z getattr(self, test_name)() 2022-11-23T03:12:18.8303089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8303187Z fn() 2022-11-23T03:12:18.8303546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8303650Z test(self, **param_kwargs) 2022-11-23T03:12:18.8304210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8304343Z return func(*args, **kwargs) 2022-11-23T03:12:18.8304599Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8304711Z self.run_subtests( 2022-11-23T03:12:18.8305227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8305391Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8305742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8305872Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8306230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8306343Z output = model(*input) 2022-11-23T03:12:18.8306655Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8306789Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8307149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8307319Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8307677Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8307778Z _lazy_init(state, module) 2022-11-23T03:12:18.8308116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8308254Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8308578Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8308697Z return func(*args, **kwargs) 2022-11-23T03:12:18.8309057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8309155Z p_assert( 2022-11-23T03:12:18.8309478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8309586Z traceback.print_stack() 2022-11-23T03:12:18.8309885Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8310125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8310350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8310571Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8310695Z File "", line 1, in 2022-11-23T03:12:18.8310899Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8311035Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8311212Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8311356Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8311560Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8311732Z self.run() 2022-11-23T03:12:18.8311934Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8312074Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8312403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8312513Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8312865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8312983Z getattr(self, test_name)() 2022-11-23T03:12:18.8313329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8313425Z fn() 2022-11-23T03:12:18.8313772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8314065Z test(self, **param_kwargs) 2022-11-23T03:12:18.8314431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8314538Z return func(*args, **kwargs) 2022-11-23T03:12:18.8314792Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8314904Z self.run_subtests( 2022-11-23T03:12:18.8315253Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8315412Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8315773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8315926Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8316298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8316405Z output = model(*input) 2022-11-23T03:12:18.8316895Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8317032Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8317393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8317562Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8317910Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8318026Z _lazy_init(state, module) 2022-11-23T03:12:18.8318362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8318483Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8318988Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8319164Z return func(*args, **kwargs) 2022-11-23T03:12:18.8319549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8319650Z p_assert( 2022-11-23T03:12:18.8319983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8320108Z traceback.print_stack() 2022-11-23T03:12:18.8320238Z File "", line 1, in 2022-11-23T03:12:18.8320429Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8320571Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8320772Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8320927Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8321136Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8321291Z self.run() 2022-11-23T03:12:18.8321496Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8321641Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8321964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8322261Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8322791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8322915Z getattr(self, test_name)() 2022-11-23T03:12:18.8323272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8323370Z fn() 2022-11-23T03:12:18.8323731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8323856Z test(self, **param_kwargs) 2022-11-23T03:12:18.8324194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8324322Z return func(*args, **kwargs) 2022-11-23T03:12:18.8324576Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8324689Z self.run_subtests( 2022-11-23T03:12:18.8325039Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8325201Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8325714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8325860Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8326199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8326319Z output = model(*input) 2022-11-23T03:12:18.8326634Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8326772Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8327315Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8327491Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8327852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8327972Z _lazy_init(state, module) 2022-11-23T03:12:18.8328302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8328446Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8328782Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8328955Z return func(*args, **kwargs) 2022-11-23T03:12:18.8329344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8329446Z p_assert( 2022-11-23T03:12:18.8329780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8329929Z traceback.print_stack() 2022-11-23T03:12:18.8330222Z File "", line 1, in 2022-11-23T03:12:18.8330427Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8330566Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8330763Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8330908Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8331293Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8331444Z self.run() 2022-11-23T03:12:18.8331634Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8331780Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8332121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8332257Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8332616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8332739Z getattr(self, test_name)() 2022-11-23T03:12:18.8333092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8333189Z fn() 2022-11-23T03:12:18.8333299Z File "", line 1, in 2022-11-23T03:12:18.8333664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8333795Z test(self, **param_kwargs) 2022-11-23T03:12:18.8334149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8334272Z return func(*args, **kwargs) 2022-11-23T03:12:18.8334480Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8334620Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:18.8334857Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8334969Z self.run_subtests( 2022-11-23T03:12:18.8335171Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8335321Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8335675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8335842Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8336051Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8336153Z self.run() 2022-11-23T03:12:18.8336498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8336654Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8336857Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8337003Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8337375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8337495Z output = model(*input) 2022-11-23T03:12:18.8337814Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8337959Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8338362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8338545Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8338881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8339016Z self.run_test(test_name, pipe) 
2022-11-23T03:12:18.8339379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8339502Z _lazy_init(state, module) 2022-11-23T03:12:18.8339858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8339980Z getattr(self, test_name)() 2022-11-23T03:12:18.8340310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8340504Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8340861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8340958Z fn() 2022-11-23T03:12:18.8341294Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8341417Z return func(*args, **kwargs) 2022-11-23T03:12:18.8341777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8341896Z test(self, **param_kwargs) 2022-11-23T03:12:18.8342253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8342354Z p_assert( 2022-11-23T03:12:18.8342709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8342838Z return func(*args, **kwargs) 2022-11-23T03:12:18.8343174Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8343299Z traceback.print_stack() 2022-11-23T03:12:18.8343555Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8343667Z self.run_subtests( 2022-11-23T03:12:18.8344366Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8344536Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8344903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8345057Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8345430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8345557Z output = model(*input) 2022-11-23T03:12:18.8345883Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8346024Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8346381Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8346558Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8346921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8347043Z _lazy_init(state, module) 2022-11-23T03:12:18.8347391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8347534Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8347940Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8348073Z return func(*args, **kwargs) 2022-11-23T03:12:18.8348436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8348538Z p_assert( 2022-11-23T03:12:18.8348871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8349000Z traceback.print_stack() 2022-11-23T03:12:18.8349238Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8349473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8349707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8349937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8350112Z File "", line 1, in 2022-11-23T03:12:18.8350328Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8350471Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8350835Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8350981Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8351184Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8351283Z self.run() 2022-11-23T03:12:18.8351484Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8351606Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8351935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8352063Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8352414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8352533Z getattr(self, test_name)() 2022-11-23T03:12:18.8352874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8352969Z fn() 2022-11-23T03:12:18.8353301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8353420Z test(self, **param_kwargs) 2022-11-23T03:12:18.8353761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8353882Z return func(*args, **kwargs) 2022-11-23T03:12:18.8354127Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8354235Z self.run_subtests( 2022-11-23T03:12:18.8354573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8354738Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8355071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8355217Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8355576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8355691Z output = model(*input) 2022-11-23T03:12:18.8356187Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8356328Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8356698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8356873Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8357287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8357399Z _lazy_init(state, module) 2022-11-23T03:12:18.8357750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8357893Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8358228Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8358352Z return func(*args, **kwargs) 2022-11-23T03:12:18.8358728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8358829Z p_assert( 2022-11-23T03:12:18.8359164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8359324Z traceback.print_stack() 2022-11-23T03:12:18.8359459Z File "", line 1, in 2022-11-23T03:12:18.8359667Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8359808Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8360006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8360157Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8360367Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8360453Z self.run() 2022-11-23T03:12:18.8360655Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8360799Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8361293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8361420Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8361773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8361892Z getattr(self, test_name)() 2022-11-23T03:12:18.8362421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8362501Z fn() 2022-11-23T03:12:18.8362864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8362986Z test(self, **param_kwargs) 2022-11-23T03:12:18.8363340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8363464Z return func(*args, **kwargs) 2022-11-23T03:12:18.8363717Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8363830Z self.run_subtests( 2022-11-23T03:12:18.8364188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8364332Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8364696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8364847Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8365217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8365593Z output = model(*input) 2022-11-23T03:12:18.8365913Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8366219Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8366594Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8366755Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8367168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8367297Z _lazy_init(state, module) 2022-11-23T03:12:18.8367646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8367786Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8368120Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8368243Z return func(*args, **kwargs) 2022-11-23T03:12:18.8368618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8368702Z p_assert( 2022-11-23T03:12:18.8369195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8369363Z traceback.print_stack() 2022-11-23T03:12:18.8369490Z File "", line 1, in 2022-11-23T03:12:18.8369692Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8369829Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8370023Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8370149Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8370353Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8370452Z self.run() 2022-11-23T03:12:18.8370647Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8370786Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8371112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8371240Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8371783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8371889Z getattr(self, test_name)() 2022-11-23T03:12:18.8372244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8372341Z fn() 2022-11-23T03:12:18.8372702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8372823Z test(self, **param_kwargs) 2022-11-23T03:12:18.8373175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8373300Z return func(*args, **kwargs) 2022-11-23T03:12:18.8373556Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8373651Z self.run_subtests( 2022-11-23T03:12:18.8374010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8374173Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8374685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8374833Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8375193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8375307Z output = model(*input) 2022-11-23T03:12:18.8375619Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8375737Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8376099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8376273Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8376701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8376825Z _lazy_init(state, module) 2022-11-23T03:12:18.8377166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8377303Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8377804Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8377911Z return func(*args, **kwargs) 2022-11-23T03:12:18.8378288Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8378391Z p_assert( 2022-11-23T03:12:18.8378727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8378914Z traceback.print_stack() 2022-11-23T03:12:18.8379047Z File "", line 1, in 2022-11-23T03:12:18.8379257Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8379399Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8379584Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8379735Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8379945Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8380049Z self.run() 2022-11-23T03:12:18.8380254Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8380400Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8380739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8380858Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8381220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8381343Z getattr(self, test_name)() 2022-11-23T03:12:18.8381697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8381795Z fn() 2022-11-23T03:12:18.8382159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8382282Z test(self, **param_kwargs) 2022-11-23T03:12:18.8382636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8382742Z return func(*args, **kwargs) 2022-11-23T03:12:18.8383330Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8383448Z self.run_subtests( 2022-11-23T03:12:18.8383802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8384167Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8384540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8384692Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8385063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8385164Z output = model(*input) 2022-11-23T03:12:18.8385488Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8385627Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8386007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8386412Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8386778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8386896Z _lazy_init(state, module) 2022-11-23T03:12:18.8387232Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8387401Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8387729Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8387853Z return func(*args, **kwargs) 2022-11-23T03:12:18.8388217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8388316Z p_assert( 2022-11-23T03:12:18.8388823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8389016Z traceback.print_stack() 2022-11-23T03:12:18.8389259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8389479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8389711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8389943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8390073Z File "", line 1, in 2022-11-23T03:12:18.8390284Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8390426Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8390627Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8390778Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8390975Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8391083Z self.run() 2022-11-23T03:12:18.8391287Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8391436Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8391782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8391916Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8392272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8392395Z getattr(self, test_name)() 2022-11-23T03:12:18.8392733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8392831Z fn() 2022-11-23T03:12:18.8393193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8393321Z test(self, **param_kwargs) 2022-11-23T03:12:18.8393680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8393807Z return func(*args, **kwargs) 2022-11-23T03:12:18.8394224Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8394318Z self.run_subtests( 2022-11-23T03:12:18.8394655Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8394813Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8395161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8395308Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8395851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8396026Z output = model(*input) 2022-11-23T03:12:18.8396365Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8396505Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8396860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8397039Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8397401Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8397521Z _lazy_init(state, module) 2022-11-23T03:12:18.8397869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8398011Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8398404Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8398530Z return func(*args, **kwargs) 2022-11-23T03:12:18.8398889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8398992Z p_assert( 2022-11-23T03:12:18.8399328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8399454Z traceback.print_stack() 2022-11-23T03:12:18.8399585Z File "", line 1, in 2022-11-23T03:12:18.8399793Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8399939Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8400121Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8400273Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8400494Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8400597Z self.run() 2022-11-23T03:12:18.8400799Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8400947Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8401287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8401582Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8401912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8402031Z getattr(self, test_name)() 2022-11-23T03:12:18.8402374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8402647Z fn() 2022-11-23T03:12:18.8403012Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8403143Z test(self, **param_kwargs) 2022-11-23T03:12:18.8403498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8403624Z return func(*args, **kwargs) 2022-11-23T03:12:18.8403861Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8403974Z self.run_subtests( 2022-11-23T03:12:18.8404323Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8404485Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8404847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8404999Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8405421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8405549Z output = model(*input) 2022-11-23T03:12:18.8405860Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8406099Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8406483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8406659Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8407027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8407149Z _lazy_init(state, module) 2022-11-23T03:12:18.8407498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8407691Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8408015Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8408141Z return func(*args, **kwargs) 2022-11-23T03:12:18.8408519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8408621Z p_assert( 2022-11-23T03:12:18.8408955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8409083Z traceback.print_stack() 2022-11-23T03:12:18.8409212Z File "", line 1, in 2022-11-23T03:12:18.8409422Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8409545Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8409747Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8409898Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8418124Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8418274Z self.run() 2022-11-23T03:12:18.8418509Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8418746Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8419116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8419253Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8419622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8419747Z getattr(self, test_name)() 2022-11-23T03:12:18.8420108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8420207Z fn() 2022-11-23T03:12:18.8420576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8420707Z test(self, **param_kwargs) 2022-11-23T03:12:18.8421049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8421177Z return func(*args, **kwargs) 2022-11-23T03:12:18.8421436Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8421552Z self.run_subtests( 2022-11-23T03:12:18.8421906Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8422070Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8422431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8422582Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8423036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8423171Z output = model(*input) 2022-11-23T03:12:18.8423503Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8423649Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8424338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8424532Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8424912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8425038Z _lazy_init(state, module) 2022-11-23T03:12:18.8425390Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8425615Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8425965Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8426096Z return func(*args, **kwargs) 2022-11-23T03:12:18.8426482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8426587Z p_assert( 2022-11-23T03:12:18.8426926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8427052Z traceback.print_stack() 2022-11-23T03:12:18.8427165Z File "", line 1, in 2022-11-23T03:12:18.8427377Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8427522Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8427726Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8427884Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8428102Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8428208Z self.run() 2022-11-23T03:12:18.8428414Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8428546Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8428889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8429024Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8429387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8429512Z getattr(self, test_name)() 2022-11-23T03:12:18.8429870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8429993Z fn() 2022-11-23T03:12:18.8430557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8430660Z test(self, **param_kwargs) 2022-11-23T03:12:18.8431007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8431131Z return func(*args, **kwargs) 2022-11-23T03:12:18.8431381Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8431495Z self.run_subtests( 2022-11-23T03:12:18.8431835Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8431996Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8432346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8432477Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8432909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8433035Z output = model(*input) 2022-11-23T03:12:18.8433353Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8433492Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8433854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8434024Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8434377Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8434477Z _lazy_init(state, module) 2022-11-23T03:12:18.8435120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8435319Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8435662Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8435789Z return func(*args, **kwargs) 2022-11-23T03:12:18.8436171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8436278Z p_assert( 2022-11-23T03:12:18.8436681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8436793Z traceback.print_stack() 2022-11-23T03:12:18.8437031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8437267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8437505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8437743Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8437877Z File "", line 1, in 2022-11-23T03:12:18.8438088Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8438231Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8438415Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8438567Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8438781Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8438886Z self.run() 2022-11-23T03:12:18.8439091Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8439238Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8439585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8439704Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8440071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8440196Z getattr(self, test_name)() 2022-11-23T03:12:18.8440554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8440655Z fn() 2022-11-23T03:12:18.8441016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8441141Z test(self, **param_kwargs) 2022-11-23T03:12:18.8441494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8441600Z return func(*args, **kwargs) 2022-11-23T03:12:18.8441859Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8441979Z self.run_subtests( 2022-11-23T03:12:18.8442540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8442735Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8443132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8443282Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8443645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8443743Z output = model(*input) 2022-11-23T03:12:18.8444059Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8444377Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8444753Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8444991Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8445359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8445481Z _lazy_init(state, module) 2022-11-23T03:12:18.8445833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8445959Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8446297Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8446423Z return func(*args, **kwargs) 2022-11-23T03:12:18.8446803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8446910Z p_assert( 2022-11-23T03:12:18.8447244Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8447379Z traceback.print_stack() 2022-11-23T03:12:18.8447510Z File "", line 1, in 2022-11-23T03:12:18.8447701Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8447846Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8448052Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8448202Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8448416Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8448521Z self.run() 2022-11-23T03:12:18.8448723Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8448848Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8449187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8449328Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8449689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8449814Z getattr(self, test_name)() 2022-11-23T03:12:18.8450171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8450270Z fn() 2022-11-23T03:12:18.8450631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8450736Z test(self, **param_kwargs) 2022-11-23T03:12:18.8451090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8451215Z return func(*args, **kwargs) 2022-11-23T03:12:18.8451471Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8451747Z self.run_subtests( 2022-11-23T03:12:18.8452137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8452302Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8452654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8452782Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8453141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8453259Z output = model(*input) 2022-11-23T03:12:18.8453572Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8453707Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8454067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8454305Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8454666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8454766Z _lazy_init(state, module) 2022-11-23T03:12:18.8455108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8455249Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8455574Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8455695Z return func(*args, **kwargs) 2022-11-23T03:12:18.8456059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8456160Z p_assert( 2022-11-23T03:12:18.8456486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8456601Z traceback.print_stack() 2022-11-23T03:12:18.8456728Z File "", line 1, in 2022-11-23T03:12:18.8456931Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8457069Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8457265Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8457413Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8457617Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8457719Z self.run() 2022-11-23T03:12:18.8457897Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8458038Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8458551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8458690Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8459055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8459181Z getattr(self, test_name)() 2022-11-23T03:12:18.8459540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8459619Z fn() 2022-11-23T03:12:18.8459987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8460112Z test(self, **param_kwargs) 2022-11-23T03:12:18.8460467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8460592Z return func(*args, **kwargs) 2022-11-23T03:12:18.8460848Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8460967Z self.run_subtests( 2022-11-23T03:12:18.8461526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8461672Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8462024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8462174Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8462715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8462837Z output = model(*input) 2022-11-23T03:12:18.8463162Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8463305Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8463678Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8464132Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8464502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8464626Z _lazy_init(state, module) 2022-11-23T03:12:18.8464978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8465120Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8465459Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8465585Z return func(*args, **kwargs) 2022-11-23T03:12:18.8466111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8466210Z p_assert( 2022-11-23T03:12:18.8466702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8466832Z traceback.print_stack() 2022-11-23T03:12:18.8466965Z File "", line 1, in 2022-11-23T03:12:18.8467174Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8467319Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8467522Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8467675Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8467869Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8467973Z self.run() 2022-11-23T03:12:18.8468176Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8468325Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8468663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8468802Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8469161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8469286Z getattr(self, test_name)() 2022-11-23T03:12:18.8469777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8469872Z fn() 2022-11-23T03:12:18.8470224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8470345Z test(self, **param_kwargs) 2022-11-23T03:12:18.8470683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8470804Z return func(*args, **kwargs) 2022-11-23T03:12:18.8471050Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8471234Z self.run_subtests( 2022-11-23T03:12:18.8471568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8471726Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8472076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8472225Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8472587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8472704Z output = model(*input) 2022-11-23T03:12:18.8473019Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8473158Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8473508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8473748Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8474198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8474319Z _lazy_init(state, module) 2022-11-23T03:12:18.8474660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8474802Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8475129Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8475250Z return func(*args, **kwargs) 2022-11-23T03:12:18.8475596Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8475962Z p_assert( 2022-11-23T03:12:18.8476305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8476558Z traceback.print_stack() 2022-11-23T03:12:18.8476797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8477034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8477267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8477517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8477630Z File "", line 1, in 2022-11-23T03:12:18.8477840Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8477983Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8478187Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8478342Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8478558Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8478825Z self.run() 2022-11-23T03:12:18.8479002Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8479322Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8479665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8479802Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8480166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8480293Z getattr(self, test_name)() 2022-11-23T03:12:18.8480652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8480755Z fn() 2022-11-23T03:12:18.8481156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8481289Z test(self, **param_kwargs) 2022-11-23T03:12:18.8481647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8481774Z return func(*args, **kwargs) 2022-11-23T03:12:18.8482031Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8482147Z self.run_subtests( 2022-11-23T03:12:18.8482500Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8482664Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8483008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8483323Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8483948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8484071Z output = model(*input) 2022-11-23T03:12:18.8484396Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8484540Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8484914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8485092Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8485437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8485559Z _lazy_init(state, module) 2022-11-23T03:12:18.8485908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8486056Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8486395Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8486521Z return func(*args, **kwargs) 2022-11-23T03:12:18.8486897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8487000Z p_assert( 2022-11-23T03:12:18.8487319Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8487507Z traceback.print_stack() 2022-11-23T03:12:18.8487640Z File "", line 1, in 2022-11-23T03:12:18.8487853Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8487998Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8488201Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8488357Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8488573Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8488660Z self.run() 2022-11-23T03:12:18.8488863Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8489011Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8489352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8489486Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8489848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8489977Z getattr(self, test_name)() 2022-11-23T03:12:18.8490316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8490419Z fn() 2022-11-23T03:12:18.8490833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8490964Z test(self, **param_kwargs) 2022-11-23T03:12:18.8491322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8491452Z return func(*args, **kwargs) 2022-11-23T03:12:18.8491711Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8491827Z self.run_subtests( 2022-11-23T03:12:18.8492161Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8492324Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8492689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8492893Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8493430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8493547Z output = model(*input) 2022-11-23T03:12:18.8493867Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8494006Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8494348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8494519Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8494874Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8494993Z _lazy_init(state, module) 2022-11-23T03:12:18.8495330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8495476Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8495805Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8496104Z return func(*args, **kwargs) 2022-11-23T03:12:18.8496486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8496570Z p_assert( 2022-11-23T03:12:18.8496907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8497035Z traceback.print_stack() 2022-11-23T03:12:18.8497166Z File "", line 1, in 2022-11-23T03:12:18.8497296Z File "", line 1, in 2022-11-23T03:12:18.8497578Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8497722Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8497915Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8498067Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8498276Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8498419Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8498632Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8498740Z self.run() 2022-11-23T03:12:18.8498941Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8499073Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8499278Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8499426Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8499639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8499748Z self.run() 2022-11-23T03:12:18.8500285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8500423Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8500619Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8500741Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8501092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8501216Z getattr(self, test_name)() 2022-11-23T03:12:18.8501541Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8501672Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8502207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8502308Z fn() 2022-11-23T03:12:18.8502723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8502829Z getattr(self, test_name)() 2022-11-23T03:12:18.8503197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8503322Z test(self, **param_kwargs) 2022-11-23T03:12:18.8503674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8503774Z fn() 2022-11-23T03:12:18.8504352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8504483Z return func(*args, **kwargs) 2022-11-23T03:12:18.8504848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8504953Z test(self, **param_kwargs) 2022-11-23T03:12:18.8505216Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8505332Z self.run_subtests( 2022-11-23T03:12:18.8505691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8505816Z return func(*args, **kwargs) 2022-11-23T03:12:18.8506164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8506331Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8506589Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8506685Z self.run_subtests( 2022-11-23T03:12:18.8507045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8507359Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8507707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8507863Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8508223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8508342Z output = model(*input) 2022-11-23T03:12:18.8508690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8508819Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8509131Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8509268Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8509629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8509748Z output = model(*input) 2022-11-23T03:12:18.8510180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8510363Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8510681Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8510797Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8511149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, 
in _root_pre_forward 2022-11-23T03:12:18.8511270Z _lazy_init(state, module) 2022-11-23T03:12:18.8511633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8511802Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8512205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8512349Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8512699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8512798Z _lazy_init(state, module) 2022-11-23T03:12:18.8513127Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8513249Z return func(*args, **kwargs) 2022-11-23T03:12:18.8513582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8513719Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8514082Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8514183Z p_assert( 2022-11-23T03:12:18.8514514Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8514619Z return func(*args, **kwargs) 2022-11-23T03:12:18.8515126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8515253Z traceback.print_stack() 2022-11-23T03:12:18.8515630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8515733Z p_assert( 2022-11-23T03:12:18.8516061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.8516189Z traceback.print_stack() 2022-11-23T03:12:18.8516426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8516642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8516877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8517119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8517251Z File "", line 1, in 2022-11-23T03:12:18.8517462Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8517605Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8517967Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8518113Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8518299Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8518401Z self.run() 2022-11-23T03:12:18.8518597Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8518744Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8519078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8519260Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8519798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8519905Z getattr(self, test_name)() 2022-11-23T03:12:18.8520265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8520364Z fn() 2022-11-23T03:12:18.8520726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8520850Z test(self, **param_kwargs) 2022-11-23T03:12:18.8521205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8521331Z return func(*args, **kwargs) 2022-11-23T03:12:18.8521587Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8521748Z self.run_subtests( 2022-11-23T03:12:18.8522102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8522265Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8522627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8522945Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8523487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8523609Z output = model(*input) 2022-11-23T03:12:18.8523936Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8524058Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8524444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8524621Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8524989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8525114Z _lazy_init(state, module) 2022-11-23T03:12:18.8525462Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8525605Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8525945Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8526212Z return func(*args, **kwargs) 2022-11-23T03:12:18.8526578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8526681Z p_assert( 2022-11-23T03:12:18.8527011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8527134Z traceback.print_stack() 2022-11-23T03:12:18.8527261Z File "", line 1, in 2022-11-23T03:12:18.8527462Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8527602Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8527960Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8528113Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8528326Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8528430Z self.run() 2022-11-23T03:12:18.8528632Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8528780Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8529125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8529306Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8529656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8529783Z getattr(self, test_name)() 2022-11-23T03:12:18.8530195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8530298Z fn() 2022-11-23T03:12:18.8530664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8530951Z test(self, **param_kwargs) 2022-11-23T03:12:18.8531296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8531399Z return func(*args, **kwargs) 2022-11-23T03:12:18.8531648Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8531851Z self.run_subtests( 2022-11-23T03:12:18.8532193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8532352Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8532702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8532853Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8533212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8533332Z output = model(*input) 2022-11-23T03:12:18.8533625Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8533762Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8534131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8534300Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8534654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8534774Z _lazy_init(state, module) 2022-11-23T03:12:18.8535112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8535420Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8535743Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8535947Z return func(*args, **kwargs) 2022-11-23T03:12:18.8536325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8536434Z p_assert( 2022-11-23T03:12:18.8536772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8536900Z traceback.print_stack() 2022-11-23T03:12:18.8537031Z File "", line 1, in 2022-11-23T03:12:18.8537221Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8537365Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8537570Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8537721Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8537932Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8538038Z self.run() 2022-11-23T03:12:18.8538240Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8538385Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8538759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8538903Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8539267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8539391Z getattr(self, test_name)() 2022-11-23T03:12:18.8539748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8539846Z fn() 2022-11-23T03:12:18.8540209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8540335Z test(self, **param_kwargs) 2022-11-23T03:12:18.8540672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8540797Z return func(*args, **kwargs) 2022-11-23T03:12:18.8541102Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8541218Z self.run_subtests( 2022-11-23T03:12:18.8541570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8541733Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8542097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8542252Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8542608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8542728Z output = model(*input) 2022-11-23T03:12:18.8543051Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8543191Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8543570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8543747Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8544349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8544477Z _lazy_init(state, module) 2022-11-23T03:12:18.8544816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8544963Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8545301Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8545592Z return func(*args, **kwargs) 2022-11-23T03:12:18.8545960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8546065Z p_assert( 2022-11-23T03:12:18.8546575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8546704Z traceback.print_stack() 2022-11-23T03:12:18.8546820Z File "", line 1, in 2022-11-23T03:12:18.8547031Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8547176Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8547382Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8547535Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8547749Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8547857Z self.run() 2022-11-23T03:12:18.8548040Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8548186Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8548600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8548746Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8549108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8549235Z getattr(self, test_name)() 2022-11-23T03:12:18.8549593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8549695Z fn() 2022-11-23T03:12:18.8550038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8550169Z test(self, **param_kwargs) 2022-11-23T03:12:18.8550521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8550646Z return func(*args, **kwargs) 2022-11-23T03:12:18.8550971Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T03:12:18.8551089Z self.run_subtests( 2022-11-23T03:12:18.8551599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8551758Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8552091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8552239Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8552602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8552721Z output = model(*input) 2022-11-23T03:12:18.8553037Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8553178Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8553544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8553714Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8554236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8554365Z _lazy_init(state, module) 2022-11-23T03:12:18.8554720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T03:12:18.8554867Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8555203Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8555331Z return func(*args, **kwargs) 2022-11-23T03:12:18.8555708Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8555817Z p_assert( 2022-11-23T03:12:18.8556137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8556264Z traceback.print_stack() 2022-11-23T03:12:18.8556502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8556739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8557135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8557362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8557761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8557989Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8558202Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8558488Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8558726Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8558954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8559181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8559412Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8559644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8559870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8560099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8560354Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8560585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8560813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8561042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8561271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8561498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8561886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8562105Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8562303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8562529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8562747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8563145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8563371Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8563597Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8563824Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8564048Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8564252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8564478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8564711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8564937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8565162Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8565388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8565613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8565837Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8566062Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8566268Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8566492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8566761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8566995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8567221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8567443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8567667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8567890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8568097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8568319Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8568591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8568820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8569045Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8569272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8569494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8569720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8569924Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8570148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8570376Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8570608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8570834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8571055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8571277Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8571658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8571875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8572072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8572288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8572504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8572727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8572942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8573158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8573374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8573590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8573785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8574001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8574219Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8574616Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8574886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8575120Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8575347Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8575574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8575777Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8576001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8576226Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8576450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8576720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8576956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8577180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8577403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8577625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8577830Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8578052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8578439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8578658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8578880Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8579097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8579206Z dist init r=0, world=4 2022-11-23T03:12:18.8579703Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8580005Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8580316Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8580622Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8580934Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8581237Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8581539Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8581833Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8582121Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8582456Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.8582752Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8583040Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.8583144Z dist init r=2, world=4 2022-11-23T03:12:18.8583450Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8583754Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8584318Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8584622Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8584918Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8585211Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8585503Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8585797Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.8586093Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8586385Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8586673Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8586962Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.8587059Z dist init r=1, world=4 2022-11-23T03:12:18.8587574Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8587882Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8588172Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8588456Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8588738Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8589203Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.8589565Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8589866Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8590186Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8590481Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8590762Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8591060Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.8591219Z dist init r=3, world=4 2022-11-23T03:12:18.8591531Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8591834Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8592130Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8592425Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.8592718Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8593014Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8593305Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8593756Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8594041Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8594314Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8594600Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8594879Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.8594966Z ok (7.225s) 2022-11-23T03:12:18.8595279Z test_nested_wrapped_model_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24366 2022-11-23T03:12:18.8595479Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24367 2022-11-23T03:12:18.8595676Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24368 2022-11-23T03:12:18.8595873Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24369 2022-11-23T03:12:18.8596465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8596631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8597005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8597185Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8597543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8597705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8598069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8598354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8598707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8598913Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8599284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8599463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8599823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.8599988Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8600508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8600684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8600912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.8601144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.8601359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.8601582Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.8601957Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8602326Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8602686Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8603043Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.8603259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.8603652Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.8603869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.8604077Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.8604298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8604520Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8604739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8604958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8606010Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8606125Z warnings.warn( 2022-11-23T03:12:18.8607269Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.8607368Z warnings.warn( 2022-11-23T03:12:18.8608385Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8608485Z warnings.warn( 2022-11-23T03:12:18.8609444Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8609543Z warnings.warn( 2022-11-23T03:12:18.8609759Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8609967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8610181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8610393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8610600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8610807Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8611014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8611221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8611425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8611633Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8611847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8612056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8612259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8612463Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8612667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8612876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8613085Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8613286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8613540Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8613759Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8613967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8614172Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8614374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8614582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8614789Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8614992Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8615418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8615635Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8615846Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8616057Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8616268Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8616482Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8616691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8616903Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8617108Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8617322Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8617534Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8617744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8617955Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8618323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8618528Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8618731Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8618928Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8619317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8619536Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8619756Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8619968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8620179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8620394Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8620605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8620816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8621021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8621236Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8621489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8621711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8621921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8622135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8622345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8622555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8622759Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8622976Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8623401Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8623782Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8624183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8624406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8624621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8624838Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8625133Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8625343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8625555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8625778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8625994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8626207Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8626416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8626630Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8626843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8627047Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8627258Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8627471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8627691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8627904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8628116Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8628326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8628539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8629299Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.8630138Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.8630894Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.8631621Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.8631845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8632131Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8632353Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8632571Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8632787Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8633004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8633222Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8633438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8633533Z dist init r=3, world=4 2022-11-23T03:12:18.8633634Z dist init r=1, world=4 2022-11-23T03:12:18.8633739Z dist init r=2, world=4 2022-11-23T03:12:18.8633836Z dist init r=0, world=4 2022-11-23T03:12:18.8633928Z ok (6.122s) 2022-11-23T03:12:18.8634251Z test_nested_wrapped_model_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24667 2022-11-23T03:12:18.8634465Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24668 2022-11-23T03:12:18.8634676Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24669 2022-11-23T03:12:18.8634875Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24670 2022-11-23T03:12:18.8635245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8635414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8635781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8635964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8636322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8636488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8636854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8637024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8637371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8637532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8637894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8638075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8638474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.8638649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8639014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8639193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8639418Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.8639653Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.8639885Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.8640119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.8640574Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8640957Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8641331Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8641791Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.8642010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.8642222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.8642434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.8642653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.8642883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8643107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8643326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8643543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8644553Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8644659Z warnings.warn( 2022-11-23T03:12:18.8645663Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.8645766Z warnings.warn( 2022-11-23T03:12:18.8646764Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8646917Z warnings.warn( 2022-11-23T03:12:18.8647925Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8648028Z warnings.warn( 2022-11-23T03:12:18.8648248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8648472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8648691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8648963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8649181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8649401Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8649618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8649829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8650044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8650257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8650471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8650691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8650911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8651126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8651341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8651552Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8651765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8651978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8652187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8652400Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8652772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8652985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8653187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8653386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8653592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8653801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8654005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8654211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8654414Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8654624Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8654887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8655102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8655298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8655505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8655708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8655911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8656112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8656502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8656767Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8656982Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8657186Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8657403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8657620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8657833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8658042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8658259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8658469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8658686Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8658889Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8659102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8659314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8659532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8659741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8659954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8660162Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8660373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8660591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8660797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8661007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8661218Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8661431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8661644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8662011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8662216Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8662422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8662696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8662911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8663115Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8663492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8663709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8664128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8664355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8664568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8664855Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8665060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8665276Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8665488Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8665800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8666011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8666225Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8666435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8666648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8666860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8667074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8667283Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8667494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8667707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8667922Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8668131Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8668348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8668550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8668773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8668875Z dist init r=3, world=4 2022-11-23T03:12:18.8668972Z dist init r=0, world=4 2022-11-23T03:12:18.8669071Z dist init r=2, world=4 2022-11-23T03:12:18.8669170Z dist init r=1, world=4 2022-11-23T03:12:18.8669258Z ok (6.523s) 2022-11-23T03:12:18.8669584Z test_nested_wrapped_model_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24968 2022-11-23T03:12:18.8669947Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24969 2022-11-23T03:12:18.8670146Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24970 2022-11-23T03:12:18.8670341Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24971 2022-11-23T03:12:18.8670702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8670935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8671302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8671480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8672003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8672161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8672525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8672703Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8673054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8673272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:12:18.8673635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8673813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8674166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8674331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8674848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8675019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8675245Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.8675469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.8675698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.8675918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.8676290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8676662Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8677024Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8677376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.8677586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.8677801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.8678193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.8678407Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.8678630Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8678851Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8679070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8679283Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8680337Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8680448Z warnings.warn( 2022-11-23T03:12:18.8681448Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.8681548Z warnings.warn( 2022-11-23T03:12:18.8682538Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8682686Z warnings.warn( 2022-11-23T03:12:18.8683681Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8683941Z warnings.warn( 2022-11-23T03:12:18.8684154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8684555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8684775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8684995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8685208Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8685418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8685633Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8685849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8686061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8686283Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8686504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8686720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8686937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8687142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8687514Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8687772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8687981Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8688184Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8688447Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8688659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8688866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8689063Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8689450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8689662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8689877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8690087Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8690299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8690561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8690773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8690984Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8691192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8691402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8691618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8691828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8692039Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8692249Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8692468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8692677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8692883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8693094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8693307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8693516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8693727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8693939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8694156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8694369Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8694729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8694935Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8695140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8695343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8695548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8695751Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8695958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8696204Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8696417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8696785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8697001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8697213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8697424Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8697633Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8697844Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8698055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8698332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8698537Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8698756Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8698966Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8699176Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8699393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8699606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8699817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8700026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8700236Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8700452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8700663Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8700875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8701085Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8701297Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8701508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8701879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8702088Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8702288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8702494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8702699Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8703082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8703294Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8703505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8703717Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8704143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8704428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8704652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8704864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8705078Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8705181Z dist init r=3, world=4 2022-11-23T03:12:18.8705282Z dist init r=0, world=4 2022-11-23T03:12:18.8705380Z dist init r=2, world=4 2022-11-23T03:12:18.8705477Z dist init r=1, world=4 2022-11-23T03:12:18.8705561Z ok (6.322s) 2022-11-23T03:12:18.8705880Z test_nested_wrapped_model_offload_true_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25269 2022-11-23T03:12:18.8706089Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25270 2022-11-23T03:12:18.8706362Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25271 2022-11-23T03:12:18.8706565Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25272 2022-11-23T03:12:18.8706938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8707105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8707474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8707649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8708001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8708165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8708530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8708715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8709069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8709394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:12:18.8709742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8710081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8710434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.8710596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.8710956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.8711142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.8711380Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.8711610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.8711839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.8712065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.8712446Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8712827Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8713205Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.8713626Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.8713851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.8714066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.8714283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.8714497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.8714720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8714933Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8715155Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8715422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8716763Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8716864Z warnings.warn( 2022-11-23T03:12:18.8717862Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.8717968Z warnings.warn( 2022-11-23T03:12:18.8719107Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.8719203Z warnings.warn( 2022-11-23T03:12:18.8720163Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.8720440Z warnings.warn( 2022-11-23T03:12:18.8720561Z File "", line 1, in 2022-11-23T03:12:18.8720766Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8720892Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8721088Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8721225Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8721431Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8721524Z self.run() 2022-11-23T03:12:18.8721714Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8721849Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8722234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8722357Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8722712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8722825Z getattr(self, test_name)() 2022-11-23T03:12:18.8723172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8723261Z fn() 2022-11-23T03:12:18.8723613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8723724Z test(self, **param_kwargs) 2022-11-23T03:12:18.8724071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8724179Z return func(*args, **kwargs) 2022-11-23T03:12:18.8724477Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8724579Z self.run_subtests( 2022-11-23T03:12:18.8724922Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8725073Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8725428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8725569Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8725932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8726034Z output = model(*input) 2022-11-23T03:12:18.8726350Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8726487Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8727011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8727170Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8727512Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8727620Z _lazy_init(state, module) 2022-11-23T03:12:18.8727946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8728067Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8728568Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8728681Z return func(*args, **kwargs) 2022-11-23T03:12:18.8729050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8729146Z p_assert( 2022-11-23T03:12:18.8729473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8729587Z traceback.print_stack() 2022-11-23T03:12:18.8729697Z File "", line 1, in 2022-11-23T03:12:18.8729894Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8730024Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8730268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8730410Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8730610Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8730703Z self.run() 2022-11-23T03:12:18.8730896Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8731024Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8731564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8731690Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8732036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8732145Z getattr(self, test_name)() 2022-11-23T03:12:18.8732483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8732566Z fn() 2022-11-23T03:12:18.8732907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8733008Z test(self, **param_kwargs) 2022-11-23T03:12:18.8733338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8733450Z return func(*args, **kwargs) 2022-11-23T03:12:18.8733743Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8733842Z self.run_subtests( 2022-11-23T03:12:18.8734171Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8734316Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8734657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8734788Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8735138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8735244Z output = model(*input) 2022-11-23T03:12:18.8735548Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8735678Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8736203Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8736370Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8736728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8736831Z _lazy_init(state, module) 2022-11-23T03:12:18.8737168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8737303Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8737629Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8737742Z return func(*args, **kwargs) 2022-11-23T03:12:18.8738111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8738205Z p_assert( 2022-11-23T03:12:18.8738531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8738640Z traceback.print_stack() 2022-11-23T03:12:18.8738758Z File "", line 1, in 2022-11-23T03:12:18.8738877Z File "", line 1, in 2022-11-23T03:12:18.8739077Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8739207Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8739398Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8739536Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8739727Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8739861Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8740067Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8740210Z self.run() 2022-11-23T03:12:18.8740408Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8740547Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8740739Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8740875Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8741068Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8741160Z self.run() 2022-11-23T03:12:18.8741490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8741610Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8741798Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8741934Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8742336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8742442Z getattr(self, test_name)() 2022-11-23T03:12:18.8742811Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8742955Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8743306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8743554Z fn() 2022-11-23T03:12:18.8744266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8744391Z getattr(self, test_name)() 2022-11-23T03:12:18.8744750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8744855Z test(self, **param_kwargs) 2022-11-23T03:12:18.8745203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8745288Z fn() 2022-11-23T03:12:18.8745631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8745744Z return func(*args, **kwargs) 2022-11-23T03:12:18.8746091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8746202Z test(self, **param_kwargs) 2022-11-23T03:12:18.8746444Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8746539Z self.run_subtests( 2022-11-23T03:12:18.8746882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8746996Z return func(*args, **kwargs) 2022-11-23T03:12:18.8747343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8747658Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8748066Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8748169Z self.run_subtests( 2022-11-23T03:12:18.8748525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8748660Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8748997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8749146Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8749507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8749619Z output = model(*input) 2022-11-23T03:12:18.8750043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8750199Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8750514Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8750637Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8750998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8751105Z output = model(*input) 2022-11-23T03:12:18.8751467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8751634Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8751948Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8752298Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8752644Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:18.8752744Z _lazy_init(state, module) 2022-11-23T03:12:18.8753093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8753253Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8753580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8753707Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8754045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8754151Z _lazy_init(state, module) 2022-11-23T03:12:18.8754467Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8754579Z return func(*args, **kwargs) 2022-11-23T03:12:18.8754900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8755024Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8755376Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8755465Z p_assert( 2022-11-23T03:12:18.8755779Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8755891Z return func(*args, **kwargs) 2022-11-23T03:12:18.8756201Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8756304Z traceback.print_stack() 2022-11-23T03:12:18.8756656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8756754Z p_assert( 2022-11-23T03:12:18.8757064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.8757173Z traceback.print_stack() 2022-11-23T03:12:18.8757393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8757612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8757826Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8758032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8758152Z File "", line 1, in 2022-11-23T03:12:18.8758344Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8758473Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8758708Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8758850Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8759044Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8759135Z self.run() 2022-11-23T03:12:18.8759488Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8759625Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8759959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8760079Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8760428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8760539Z getattr(self, test_name)() 2022-11-23T03:12:18.8760884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8761036Z fn() 2022-11-23T03:12:18.8761384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8761496Z test(self, **param_kwargs) 2022-11-23T03:12:18.8761841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8761953Z return func(*args, **kwargs) 2022-11-23T03:12:18.8762194Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8762445Z self.run_subtests( 2022-11-23T03:12:18.8762773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8762919Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8763249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8763393Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8763926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8764037Z output = model(*input) 2022-11-23T03:12:18.8764355Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8764487Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8764851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8765018Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8765365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8765475Z _lazy_init(state, module) 2022-11-23T03:12:18.8765819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8765951Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8766272Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8766384Z return func(*args, **kwargs) 2022-11-23T03:12:18.8766912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8767003Z p_assert( 2022-11-23T03:12:18.8767553Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8767670Z traceback.print_stack() 2022-11-23T03:12:18.8767789Z File "", line 1, in 2022-11-23T03:12:18.8767985Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8768119Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8768355Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8768503Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8768696Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8768789Z self.run() 2022-11-23T03:12:18.8768978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8769114Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8769445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8769567Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8769915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8770027Z getattr(self, test_name)() 2022-11-23T03:12:18.8770519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8770655Z fn() 2022-11-23T03:12:18.8770998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8771106Z test(self, **param_kwargs) 2022-11-23T03:12:18.8771437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8771547Z return func(*args, **kwargs) 2022-11-23T03:12:18.8771780Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8771881Z self.run_subtests( 2022-11-23T03:12:18.8772200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8772344Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8772687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8772826Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8773176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8773281Z output = model(*input) 2022-11-23T03:12:18.8773587Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8773711Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8774052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8774211Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8774551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8774662Z _lazy_init(state, module) 2022-11-23T03:12:18.8774992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8775120Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8775234Z File "", line 1, in 
2022-11-23T03:12:18.8775546Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8775649Z return func(*args, **kwargs) 2022-11-23T03:12:18.8776001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8776267Z p_assert( 2022-11-23T03:12:18.8776470Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8776600Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8776926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8777046Z traceback.print_stack() 2022-11-23T03:12:18.8777284Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8777424Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8777625Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8777717Z self.run() 2022-11-23T03:12:18.8777908Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8778042Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8778367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8778489Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8778834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8778948Z getattr(self, test_name)() 2022-11-23T03:12:18.8779500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8779584Z fn() 2022-11-23T03:12:18.8779923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8780202Z test(self, **param_kwargs) 2022-11-23T03:12:18.8780546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8780663Z return func(*args, **kwargs) 2022-11-23T03:12:18.8780897Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8781002Z self.run_subtests( 2022-11-23T03:12:18.8781344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8781499Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8781853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8781994Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8782359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8782468Z output = model(*input) 2022-11-23T03:12:18.8782774Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8782906Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8783274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8783445Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8783798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8784271Z _lazy_init(state, module) 2022-11-23T03:12:18.8784798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8784934Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8785251Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8785367Z return func(*args, **kwargs) 2022-11-23T03:12:18.8785734Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8785825Z p_assert( 2022-11-23T03:12:18.8786151Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8786265Z traceback.print_stack() 2022-11-23T03:12:18.8786384Z File "", line 1, in 2022-11-23T03:12:18.8786580Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8786708Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8786967Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8787116Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8787316Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8787408Z self.run() 2022-11-23T03:12:18.8787649Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8787787Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8788110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8788236Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8788585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8788696Z getattr(self, test_name)() 2022-11-23T03:12:18.8789114Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8789200Z fn() 2022-11-23T03:12:18.8789553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8789666Z test(self, **param_kwargs) 2022-11-23T03:12:18.8790000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8790114Z return func(*args, **kwargs) 2022-11-23T03:12:18.8790356Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8790456Z self.run_subtests( 2022-11-23T03:12:18.8790796Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8790948Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8791305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8791446Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8791802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8791912Z output = model(*input) 2022-11-23T03:12:18.8792228Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8792358Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8792720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8792883Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8793234Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8793347Z _lazy_init(state, module) 2022-11-23T03:12:18.8793682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8793815Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8794297Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8794410Z return func(*args, **kwargs) 2022-11-23T03:12:18.8794767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8794855Z p_assert( 2022-11-23T03:12:18.8795166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8795278Z traceback.print_stack() 2022-11-23T03:12:18.8795488Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8795708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8795967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8796189Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8796305Z File "", line 1, in 2022-11-23T03:12:18.8796497Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8796625Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8796807Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8796936Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8797313Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8797407Z self.run() 2022-11-23T03:12:18.8797600Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8797813Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8798150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8798273Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8798625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8798730Z getattr(self, test_name)() 2022-11-23T03:12:18.8799078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8799165Z fn() 2022-11-23T03:12:18.8799517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8799628Z test(self, **param_kwargs) 2022-11-23T03:12:18.8799971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8800087Z return func(*args, **kwargs) 2022-11-23T03:12:18.8800331Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8800428Z self.run_subtests( 2022-11-23T03:12:18.8800769Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8801075Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8801414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8801550Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8801901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8802007Z output = model(*input) 2022-11-23T03:12:18.8802496Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8802624Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8802991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8803158Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8803510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8803620Z _lazy_init(state, module) 2022-11-23T03:12:18.8803955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8804086Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8804411Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8804518Z return func(*args, **kwargs) 2022-11-23T03:12:18.8804883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8805025Z p_assert( 2022-11-23T03:12:18.8805359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8805472Z traceback.print_stack() 2022-11-23T03:12:18.8805590Z File "", line 1, in 2022-11-23T03:12:18.8805786Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8805908Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8806100Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8806239Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8806439Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8806533Z self.run() 2022-11-23T03:12:18.8806724Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8807070Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8807389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8807500Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8807842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8807953Z getattr(self, test_name)() 2022-11-23T03:12:18.8808288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8808371Z fn() 2022-11-23T03:12:18.8808712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8808819Z test(self, **param_kwargs) 2022-11-23T03:12:18.8809150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8809257Z return func(*args, **kwargs) 2022-11-23T03:12:18.8809491Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8809590Z self.run_subtests( 2022-11-23T03:12:18.8809914Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8810060Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8810398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8810533Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8810881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8810978Z output = model(*input) 2022-11-23T03:12:18.8811281Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8811412Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8811763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8811920Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8812262Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8812368Z _lazy_init(state, module) 2022-11-23T03:12:18.8812692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8812812Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8813125Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8813235Z return func(*args, **kwargs) 2022-11-23T03:12:18.8813633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8813731Z p_assert( 2022-11-23T03:12:18.8814048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8814160Z traceback.print_stack() 2022-11-23T03:12:18.8814275Z File "", line 1, in 2022-11-23T03:12:18.8814459Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8814584Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8814767Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8814906Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8815100Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8815191Z self.run() 2022-11-23T03:12:18.8815374Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8815543Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8815866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8815983Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8816507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8816618Z getattr(self, test_name)() 2022-11-23T03:12:18.8816966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8817056Z fn() 2022-11-23T03:12:18.8817410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8817515Z test(self, **param_kwargs) 2022-11-23T03:12:18.8817859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8817976Z return func(*args, **kwargs) 2022-11-23T03:12:18.8818218Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8818321Z self.run_subtests( 2022-11-23T03:12:18.8818660Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8818810Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8819315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8819443Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8819793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8819898Z output = model(*input) 2022-11-23T03:12:18.8820204Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8820336Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8820869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8821034Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8821395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8821499Z _lazy_init(state, module) 2022-11-23T03:12:18.8821838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8821970Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8822293Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8822406Z return func(*args, **kwargs) 2022-11-23T03:12:18.8822824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8822923Z p_assert( 2022-11-23T03:12:18.8823247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8823354Z traceback.print_stack() 2022-11-23T03:12:18.8823472Z File "", line 1, in 2022-11-23T03:12:18.8823668Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8823798Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8824537Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8824685Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8824887Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8824980Z self.run() 2022-11-23T03:12:18.8825165Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8825404Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8825742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8825866Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8826215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8826326Z getattr(self, test_name)() 2022-11-23T03:12:18.8826671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8826749Z fn() 2022-11-23T03:12:18.8827102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8827372Z test(self, **param_kwargs) 2022-11-23T03:12:18.8827704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8827821Z return func(*args, **kwargs) 2022-11-23T03:12:18.8828054Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8828155Z self.run_subtests( 2022-11-23T03:12:18.8828651Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8828795Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8829152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8829297Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8829662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8829770Z output = model(*input) 2022-11-23T03:12:18.8830082Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8830270Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8830642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8830798Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8831153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8831265Z _lazy_init(state, module) 2022-11-23T03:12:18.8831766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8831894Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8832211Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8832319Z return func(*args, **kwargs) 2022-11-23T03:12:18.8832740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8832829Z p_assert( 2022-11-23T03:12:18.8833144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8833253Z traceback.print_stack() 2022-11-23T03:12:18.8833473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8833688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8833901Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8834111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8834225Z File "", line 1, in 2022-11-23T03:12:18.8834409Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8834583Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8834770Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8834903Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8835095Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8835185Z self.run() 2022-11-23T03:12:18.8835370Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8835499Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8835812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8835934Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8836272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8836554Z getattr(self, test_name)() 2022-11-23T03:12:18.8836911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8836998Z fn() 2022-11-23T03:12:18.8837351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8837460Z test(self, **param_kwargs) 2022-11-23T03:12:18.8837798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8837913Z return func(*args, **kwargs) 2022-11-23T03:12:18.8838154Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8838256Z self.run_subtests( 2022-11-23T03:12:18.8838595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8838747Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8839103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8839247Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8839602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8839711Z output = model(*input) 2022-11-23T03:12:18.8840023Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8840151Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8840513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8840679Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8841032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8841146Z _lazy_init(state, module) 2022-11-23T03:12:18.8841525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8841664Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8841990Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8842103Z return func(*args, **kwargs) 2022-11-23T03:12:18.8842465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8842555Z p_assert( 2022-11-23T03:12:18.8842877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8842991Z traceback.print_stack() 2022-11-23T03:12:18.8843103Z File "", line 1, in 2022-11-23T03:12:18.8843303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8843480Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8843676Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8843814Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8844014Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8844107Z self.run() 2022-11-23T03:12:18.8844290Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8844425Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8844754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8844875Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8845225Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8845336Z getattr(self, test_name)() 2022-11-23T03:12:18.8845843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8845928Z fn() 2022-11-23T03:12:18.8846262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8846371Z test(self, **param_kwargs) 2022-11-23T03:12:18.8846704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8846813Z return func(*args, **kwargs) 2022-11-23T03:12:18.8847046Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8847145Z self.run_subtests( 2022-11-23T03:12:18.8847472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8847617Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8847954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8848092Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8848624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8848735Z output = model(*input) 2022-11-23T03:12:18.8849051Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8849180Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8849544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8849709Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8850053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8850168Z _lazy_init(state, module) 2022-11-23T03:12:18.8850332Z File "", line 1, in 2022-11-23T03:12:18.8850682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8850814Z handle.init_flat_param_attributes() 
2022-11-23T03:12:18.8851140Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8851255Z return func(*args, **kwargs) 2022-11-23T03:12:18.8851452Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8851576Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8851940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8852030Z p_assert( 2022-11-23T03:12:18.8852224Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8852569Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8852882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8852993Z traceback.print_stack() 2022-11-23T03:12:18.8853187Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8853269Z self.run() 2022-11-23T03:12:18.8853452Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8853581Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8853892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8854009Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8854527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8854647Z getattr(self, test_name)() 2022-11-23T03:12:18.8854991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8855080Z fn() 2022-11-23T03:12:18.8855431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8855540Z test(self, **param_kwargs) 2022-11-23T03:12:18.8855886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8856001Z return func(*args, **kwargs) 2022-11-23T03:12:18.8856245Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8856346Z self.run_subtests( 2022-11-23T03:12:18.8856676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8856827Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8857348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8857488Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8857839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8857942Z output = model(*input) 2022-11-23T03:12:18.8858244Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8858369Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8858712Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8858874Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8859212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8859374Z _lazy_init(state, module) 2022-11-23T03:12:18.8859890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8860021Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8860345Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8860457Z return func(*args, **kwargs) 2022-11-23T03:12:18.8860813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8860904Z p_assert( 2022-11-23T03:12:18.8861228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8861340Z traceback.print_stack() 2022-11-23T03:12:18.8861461Z File "", line 1, in 2022-11-23T03:12:18.8861658Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8861843Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8862034Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8862166Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8862368Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8862458Z self.run() 2022-11-23T03:12:18.8862796Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8862929Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8863245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8863361Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8863689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8863803Z getattr(self, test_name)() 2022-11-23T03:12:18.8864541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8864625Z fn() 2022-11-23T03:12:18.8864975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8865086Z test(self, **param_kwargs) 2022-11-23T03:12:18.8865427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8865541Z return func(*args, **kwargs) 2022-11-23T03:12:18.8865775Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8865878Z self.run_subtests( 2022-11-23T03:12:18.8866215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8866370Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8866725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8866866Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8867230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8867337Z output = model(*input) 2022-11-23T03:12:18.8867642Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8867772Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8868137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8868302Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8868657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8868839Z _lazy_init(state, module) 2022-11-23T03:12:18.8869189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8869321Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8869644Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8869751Z return func(*args, **kwargs) 2022-11-23T03:12:18.8870115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8870204Z p_assert( 2022-11-23T03:12:18.8870527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8870641Z traceback.print_stack() 2022-11-23T03:12:18.8870865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8871156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8871371Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8871597Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8871876Z File "", line 1, in 2022-11-23T03:12:18.8872072Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8872198Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8872381Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8872515Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8872710Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8872792Z self.run() 2022-11-23T03:12:18.8872975Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8873113Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8873433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8873550Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8873887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8873995Z getattr(self, test_name)() 2022-11-23T03:12:18.8874330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8874406Z fn() 2022-11-23T03:12:18.8874746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8875030Z test(self, **param_kwargs) 2022-11-23T03:12:18.8875376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8875497Z return func(*args, **kwargs) 2022-11-23T03:12:18.8875741Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8875842Z self.run_subtests( 2022-11-23T03:12:18.8876181Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8876324Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8876673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8876814Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8877177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8877287Z output = model(*input) 2022-11-23T03:12:18.8877597Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8877776Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8878148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8878304Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8878818Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8878924Z _lazy_init(state, module) 2022-11-23T03:12:18.8879253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8879381Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8879693Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8879801Z return func(*args, **kwargs) 2022-11-23T03:12:18.8880216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8880296Z p_assert( 2022-11-23T03:12:18.8880794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8880910Z traceback.print_stack() 2022-11-23T03:12:18.8881029Z File "", line 1, in 2022-11-23T03:12:18.8881226Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8881358Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8881545Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8881676Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8881877Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8881968Z self.run() 2022-11-23T03:12:18.8882159Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8882302Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8882631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8882755Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8883102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8883206Z getattr(self, test_name)() 2022-11-23T03:12:18.8883553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8883639Z fn() 2022-11-23T03:12:18.8883993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8884106Z test(self, **param_kwargs) 2022-11-23T03:12:18.8884449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8884570Z return func(*args, **kwargs) 2022-11-23T03:12:18.8884812Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8884908Z self.run_subtests( 2022-11-23T03:12:18.8885250Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8885403Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8885748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8885888Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8886252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8886359Z output = model(*input) 2022-11-23T03:12:18.8886675Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8886845Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8887217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8887385Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8887791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8887902Z _lazy_init(state, module) 2022-11-23T03:12:18.8888179Z File "", line 1, in 2022-11-23T03:12:18.8888510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8888637Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8888941Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8889114Z return func(*args, **kwargs) 2022-11-23T03:12:18.8889312Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8889442Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8889981Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8890075Z p_assert( 2022-11-23T03:12:18.8890270Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8890410Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8890725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8890839Z traceback.print_stack() 2022-11-23T03:12:18.8891042Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8891135Z self.run() 2022-11-23T03:12:18.8891334Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8891471Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8891798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8891915Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8892261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8892374Z getattr(self, test_name)() 2022-11-23T03:12:18.8892716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8892805Z fn() 2022-11-23T03:12:18.8892925Z File "", line 1, in 2022-11-23T03:12:18.8893277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8893388Z test(self, **param_kwargs) 2022-11-23T03:12:18.8893732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8893848Z return func(*args, **kwargs) 2022-11-23T03:12:18.8894045Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8894178Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:18.8894416Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8894677Z self.run_subtests( 2022-11-23T03:12:18.8894860Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8894994Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8895317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8895464Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8895663Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8895802Z self.run() 2022-11-23T03:12:18.8896157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8896294Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8896484Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8896611Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8896954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8897060Z output = model(*input) 2022-11-23T03:12:18.8897374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8897670Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8897985Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8898169Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8898517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8898632Z getattr(self, test_name)() 2022-11-23T03:12:18.8898986Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8899150Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8899493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8899581Z fn() 2022-11-23T03:12:18.8899935Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8900048Z _lazy_init(state, module) 2022-11-23T03:12:18.8900410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8900523Z test(self, **param_kwargs) 2022-11-23T03:12:18.8900851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8900985Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8901483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8901595Z return func(*args, **kwargs) 2022-11-23T03:12:18.8901910Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8902018Z return func(*args, **kwargs) 2022-11-23T03:12:18.8902249Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8902342Z self.run_subtests( 2022-11-23T03:12:18.8902705Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8902793Z p_assert( 2022-11-23T03:12:18.8903117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8903260Z test_fn(*test_args, **test_kwargs, 
**subtest_kwargs) 2022-11-23T03:12:18.8903572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8903682Z traceback.print_stack() 2022-11-23T03:12:18.8904223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8904536Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8904906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8905014Z output = model(*input) 2022-11-23T03:12:18.8905393Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8905531Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8905897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8906060Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8906413Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8906524Z _lazy_init(state, module) 2022-11-23T03:12:18.8906852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8906983Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8907468Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8907642Z return func(*args, **kwargs) 2022-11-23T03:12:18.8908003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8908093Z p_assert( 2022-11-23T03:12:18.8908405Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8908508Z traceback.print_stack() 2022-11-23T03:12:18.8908732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8908947Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8909161Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8909376Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8909492Z File "", line 1, in 2022-11-23T03:12:18.8909683Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8909819Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8909996Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8910130Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8910323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8910411Z self.run() 2022-11-23T03:12:18.8910595Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8910724Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8911043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8911163Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8911495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8911609Z getattr(self, test_name)() 2022-11-23T03:12:18.8911944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8912032Z fn() 2022-11-23T03:12:18.8912374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8912484Z test(self, **param_kwargs) 2022-11-23T03:12:18.8912818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8912926Z return func(*args, **kwargs) 2022-11-23T03:12:18.8913151Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8913249Z self.run_subtests( 2022-11-23T03:12:18.8913574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8913722Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8914110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8914255Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8914608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8914712Z output = model(*input) 2022-11-23T03:12:18.8915006Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8915130Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8915477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8915634Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8915974Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8916131Z _lazy_init(state, module) 2022-11-23T03:12:18.8916637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8916771Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8917089Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8917201Z return func(*args, **kwargs) 2022-11-23T03:12:18.8917568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8917658Z p_assert( 2022-11-23T03:12:18.8917985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8918099Z traceback.print_stack() 2022-11-23T03:12:18.8918217Z File "", line 1, in 2022-11-23T03:12:18.8918422Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8918550Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8918742Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8918881Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8919081Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8919175Z self.run() 2022-11-23T03:12:18.8919367Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8919505Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8919825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8919952Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8920301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8920418Z getattr(self, test_name)() 2022-11-23T03:12:18.8920766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8920853Z fn() 2022-11-23T03:12:18.8921206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8921318Z test(self, **param_kwargs) 2022-11-23T03:12:18.8921654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8921768Z return func(*args, **kwargs) 2022-11-23T03:12:18.8922007Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8922108Z self.run_subtests( 2022-11-23T03:12:18.8922446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8922600Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8922995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8923142Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8923496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8923604Z output = model(*input) 2022-11-23T03:12:18.8923915Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8924043Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8924570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8924903Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8925256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8925414Z _lazy_init(state, module) 2022-11-23T03:12:18.8925748Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8925879Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8926202Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8926314Z return func(*args, **kwargs) 2022-11-23T03:12:18.8926677Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8926769Z p_assert( 2022-11-23T03:12:18.8927091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8927206Z traceback.print_stack() 2022-11-23T03:12:18.8927317Z File "", line 1, in 2022-11-23T03:12:18.8927680Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8927811Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8927996Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8928129Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8928323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8928412Z self.run() 2022-11-23T03:12:18.8928589Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8928719Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8929216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8929337Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8929454Z File "", line 1, in 2022-11-23T03:12:18.8929802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8929920Z getattr(self, test_name)() 2022-11-23T03:12:18.8930313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8930396Z fn() 2022-11-23T03:12:18.8930592Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in 
spawn_main 2022-11-23T03:12:18.8930725Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8931080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8931194Z test(self, **param_kwargs) 2022-11-23T03:12:18.8931387Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8931526Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8931867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8931978Z return func(*args, **kwargs) 2022-11-23T03:12:18.8932381Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8932480Z self.run() 2022-11-23T03:12:18.8932715Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8932814Z self.run_subtests( 2022-11-23T03:12:18.8932997Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8933127Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8933460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8933599Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8933912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8934077Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8934419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8934553Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8934888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8934995Z 
getattr(self, test_name)() 2022-11-23T03:12:18.8935344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8935442Z output = model(*input) 2022-11-23T03:12:18.8935773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8935856Z fn() 2022-11-23T03:12:18.8936158Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8936287Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8936627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8936737Z test(self, **param_kwargs) 2022-11-23T03:12:18.8937264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8937429Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8937778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8937891Z return func(*args, **kwargs) 2022-11-23T03:12:18.8938244Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8938354Z _lazy_init(state, module) 2022-11-23T03:12:18.8938593Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8938698Z self.run_subtests( 2022-11-23T03:12:18.8939042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8939166Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8939507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 
2022-11-23T03:12:18.8939657Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8939982Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8940095Z return func(*args, **kwargs) 2022-11-23T03:12:18.8940608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8940744Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8941281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8941454Z p_assert( 2022-11-23T03:12:18.8941824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8941937Z output = model(*input) 2022-11-23T03:12:18.8942259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8942372Z traceback.print_stack() 2022-11-23T03:12:18.8942682Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8942813Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8943173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8943331Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8943740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8944062Z _lazy_init(state, module) 2022-11-23T03:12:18.8944575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8944702Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8945016Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8945123Z return func(*args, **kwargs) 2022-11-23T03:12:18.8945474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8945555Z p_assert( 2022-11-23T03:12:18.8945867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8945977Z traceback.print_stack() 2022-11-23T03:12:18.8946195Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8946418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8946631Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8946845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.8946959Z File "", line 1, in 2022-11-23T03:12:18.8947145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8947272Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8947455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8947588Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8947780Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8947871Z self.run() 2022-11-23T03:12:18.8948060Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8948187Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8948504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8948622Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8949145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8949258Z getattr(self, test_name)() 2022-11-23T03:12:18.8949601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8949688Z fn() 2022-11-23T03:12:18.8950045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8950150Z test(self, **param_kwargs) 2022-11-23T03:12:18.8950490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8950680Z return func(*args, **kwargs) 2022-11-23T03:12:18.8950930Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8951032Z self.run_subtests( 2022-11-23T03:12:18.8951371Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8951520Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8951871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8952005Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8952365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8952474Z output = model(*input) 2022-11-23T03:12:18.8953026Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8953149Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8953498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8953657Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8954001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8954100Z _lazy_init(state, module) 2022-11-23T03:12:18.8954425Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8954551Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8954865Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8954979Z return func(*args, **kwargs) 2022-11-23T03:12:18.8955334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8955424Z p_assert( 2022-11-23T03:12:18.8955738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8955842Z traceback.print_stack() 2022-11-23T03:12:18.8955956Z File "", line 1, in 2022-11-23T03:12:18.8956148Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8956277Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8956463Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8956774Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8956975Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8957065Z self.run() 2022-11-23T03:12:18.8957260Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8957394Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8957722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8957846Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8958192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8958303Z getattr(self, test_name)() 2022-11-23T03:12:18.8958646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8958726Z fn() 2022-11-23T03:12:18.8959077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8959188Z test(self, **param_kwargs) 2022-11-23T03:12:18.8959749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8959865Z return func(*args, **kwargs) 2022-11-23T03:12:18.8960274Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8960377Z self.run_subtests( 2022-11-23T03:12:18.8960717Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8960860Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8961210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8961352Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8961714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8961873Z output = model(*input) 2022-11-23T03:12:18.8962195Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8962323Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8962689Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8962849Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8963200Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8963312Z _lazy_init(state, module) 2022-11-23T03:12:18.8963649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8963780Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8964105Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8964222Z return func(*args, **kwargs) 2022-11-23T03:12:18.8964592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8964678Z p_assert( 2022-11-23T03:12:18.8965002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8965118Z traceback.print_stack() 2022-11-23T03:12:18.8965236Z File "", line 1, in 2022-11-23T03:12:18.8965433Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8965666Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8965862Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8966002Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8966195Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8966290Z self.run() 2022-11-23T03:12:18.8966485Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8966621Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8966951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8967073Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8967422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8967689Z getattr(self, test_name)() 2022-11-23T03:12:18.8968208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8968297Z fn() 2022-11-23T03:12:18.8968645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8968758Z test(self, **param_kwargs) 2022-11-23T03:12:18.8969154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8969273Z return func(*args, **kwargs) 2022-11-23T03:12:18.8969514Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8969610Z self.run_subtests( 2022-11-23T03:12:18.8969949Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8970097Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8970445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8970586Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8971100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8971251Z output = model(*input) 2022-11-23T03:12:18.8971556Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8971675Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8972203Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8972368Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8972721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8972830Z _lazy_init(state, module) 2022-11-23T03:12:18.8973167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8973297Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8973618Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8973734Z return func(*args, **kwargs) 2022-11-23T03:12:18.8974096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8974187Z p_assert( 2022-11-23T03:12:18.8974513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8974626Z traceback.print_stack() 2022-11-23T03:12:18.8974742Z File "", line 1, in 2022-11-23T03:12:18.8975103Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8975230Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8975405Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8975540Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8975734Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8975828Z self.run() 2022-11-23T03:12:18.8976014Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8976144Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8976459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8976578Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8976908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8977018Z getattr(self, test_name)() 2022-11-23T03:12:18.8977348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8977431Z fn() 2022-11-23T03:12:18.8977770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.8977878Z test(self, **param_kwargs) 2022-11-23T03:12:18.8978440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8978559Z return func(*args, **kwargs) 2022-11-23T03:12:18.8978793Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8978894Z self.run_subtests( 2022-11-23T03:12:18.8979234Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8979385Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8979736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8979876Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8980237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8980390Z output = model(*input) 2022-11-23T03:12:18.8980702Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8980833Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8981199Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8981362Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8981721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8981831Z _lazy_init(state, module) 2022-11-23T03:12:18.8982167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8982300Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8982620Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8982743Z return func(*args, **kwargs) 2022-11-23T03:12:18.8983111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8983202Z p_assert( 2022-11-23T03:12:18.8983527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.8983641Z traceback.print_stack() 2022-11-23T03:12:18.8984058Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8984297Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8984513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8984892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.8985009Z File "", line 1, in 2022-11-23T03:12:18.8985208Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8985513Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8985710Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8985849Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8986040Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8986134Z self.run() 2022-11-23T03:12:18.8986324Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8986459Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8986797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8986919Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8987266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8987384Z getattr(self, test_name)() 2022-11-23T03:12:18.8987840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8987937Z fn() 2022-11-23T03:12:18.8988293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8988563Z test(self, **param_kwargs) 2022-11-23T03:12:18.8988900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8989012Z return func(*args, **kwargs) 2022-11-23T03:12:18.8989244Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8989342Z self.run_subtests( 2022-11-23T03:12:18.8989662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8989873Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8990398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.8990540Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.8990903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.8991009Z output = model(*input) 2022-11-23T03:12:18.8991321Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.8991452Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.8991809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.8991973Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.8992333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.8992442Z _lazy_init(state, module) 2022-11-23T03:12:18.8992778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.8992910Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.8993235Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.8993346Z return func(*args, **kwargs) 2022-11-23T03:12:18.8993704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.8993796Z p_assert( 2022-11-23T03:12:18.8994120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.8994236Z traceback.print_stack() 2022-11-23T03:12:18.8994355Z File "", line 1, in 2022-11-23T03:12:18.8994561Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8994694Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8995040Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.8995168Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.8995364Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.8995454Z self.run() 2022-11-23T03:12:18.8995639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.8995767Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.8996083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.8996201Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.8996530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.8996684Z getattr(self, test_name)() 2022-11-23T03:12:18.8997026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.8997109Z fn() 2022-11-23T03:12:18.8997444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.8997552Z test(self, **param_kwargs) 2022-11-23T03:12:18.8998061Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.8998176Z return func(*args, **kwargs) 2022-11-23T03:12:18.8998413Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.8998515Z self.run_subtests( 2022-11-23T03:12:18.8998633Z File "", line 1, in 2022-11-23T03:12:18.8999029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.8999181Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.8999378Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.8999562Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.8999914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9000050Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9000241Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9000379Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9000741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9000849Z output = model(*input) 2022-11-23T03:12:18.9001055Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9001151Z self.run() 2022-11-23T03:12:18.9001468Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9001593Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9001786Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9001921Z 
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9002444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9002605Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9002920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9003037Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9003563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9003674Z _lazy_init(state, module) 2022-11-23T03:12:18.9004022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9004134Z getattr(self, test_name)() 2022-11-23T03:12:18.9004470Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9004600Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9004945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9005035Z fn() 2022-11-23T03:12:18.9005357Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9005465Z return func(*args, **kwargs) 2022-11-23T03:12:18.9005818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9005993Z test(self, **param_kwargs) 2022-11-23T03:12:18.9006350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9006462Z return func(*args, **kwargs) 2022-11-23T03:12:18.9006826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T03:12:18.9006917Z p_assert( 2022-11-23T03:12:18.9007151Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9007253Z self.run_subtests( 2022-11-23T03:12:18.9007576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9007690Z traceback.print_stack() 2022-11-23T03:12:18.9008028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9008237Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9008591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9008730Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9009084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9009192Z output = model(*input) 2022-11-23T03:12:18.9009665Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9009791Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9010139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9010473Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9010836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9010949Z _lazy_init(state, module) 2022-11-23T03:12:18.9011287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9011412Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9011735Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9011847Z return func(*args, **kwargs) 2022-11-23T03:12:18.9012213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9012304Z p_assert( 2022-11-23T03:12:18.9012627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9012743Z traceback.print_stack() 2022-11-23T03:12:18.9012858Z File "", line 1, in 2022-11-23T03:12:18.9013061Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9013193Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9013384Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9013524Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9013725Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9013815Z self.run() 2022-11-23T03:12:18.9014009Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9014138Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9014460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9014581Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9014930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9015090Z getattr(self, test_name)() 2022-11-23T03:12:18.9015445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9015531Z fn() 2022-11-23T03:12:18.9016044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9016147Z test(self, **param_kwargs) 2022-11-23T03:12:18.9016480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9016591Z return func(*args, **kwargs) 2022-11-23T03:12:18.9016823Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9017101Z self.run_subtests( 2022-11-23T03:12:18.9017440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9017657Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9018015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9018151Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9018514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9018620Z output = model(*input) 2022-11-23T03:12:18.9018933Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9019061Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9019427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9019589Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9020103Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9020204Z _lazy_init(state, module) 2022-11-23T03:12:18.9020531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9020659Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9020970Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9021079Z return func(*args, **kwargs) 2022-11-23T03:12:18.9021611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9021702Z p_assert( 2022-11-23T03:12:18.9022026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9022134Z traceback.print_stack() 2022-11-23T03:12:18.9022365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9022589Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9022812Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9023032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9023149Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.9023350Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9023480Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9023664Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9023803Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9024214Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9024317Z self.run() 2022-11-23T03:12:18.9024579Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9024723Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9025057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9025172Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9025522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9025634Z getattr(self, test_name)() 2022-11-23T03:12:18.9025983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9026069Z fn() 2022-11-23T03:12:18.9026418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9026529Z test(self, **param_kwargs) 2022-11-23T03:12:18.9026941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9027048Z return func(*args, **kwargs) 2022-11-23T03:12:18.9027290Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9027391Z self.run_subtests( 2022-11-23T03:12:18.9027729Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9028039Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9028374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9028509Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9028858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9028961Z output = model(*input) 2022-11-23T03:12:18.9029437Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9029570Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9029935Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9030097Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9030494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9030606Z _lazy_init(state, module) 2022-11-23T03:12:18.9030945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9031069Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9031393Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9031514Z return func(*args, **kwargs) 2022-11-23T03:12:18.9031881Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9031973Z p_assert( 2022-11-23T03:12:18.9032293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9032567Z traceback.print_stack() 2022-11-23T03:12:18.9032684Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.9032867Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9032995Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9033176Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9033310Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9033503Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9033596Z self.run() 2022-11-23T03:12:18.9033828Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9033957Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9034276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9034395Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9034508Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.9034842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9034950Z getattr(self, test_name)() 2022-11-23T03:12:18.9035283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9035368Z fn() 2022-11-23T03:12:18.9035550Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9035723Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9036071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9036181Z test(self, **param_kwargs) 2022-11-23T03:12:18.9036364Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9036496Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9036827Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9036937Z return func(*args, **kwargs) 2022-11-23T03:12:18.9037300Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9037393Z self.run() 2022-11-23T03:12:18.9037634Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9037735Z self.run_subtests( 2022-11-23T03:12:18.9037934Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9038068Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9038412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9038562Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9038882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9039005Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9039353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9039494Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9039839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9039957Z getattr(self, test_name)() 2022-11-23T03:12:18.9040318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9040427Z output = model(*input) 2022-11-23T03:12:18.9040921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9041005Z fn() 2022-11-23T03:12:18.9041485Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9041616Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9041967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9042078Z test(self, **param_kwargs) 2022-11-23T03:12:18.9042436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9042605Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9043066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9043187Z return func(*args, **kwargs) 2022-11-23T03:12:18.9043542Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9043653Z _lazy_init(state, module) 2022-11-23T03:12:18.9043894Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9043998Z self.run_subtests( 2022-11-23T03:12:18.9044337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9044630Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9045134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9045329Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9045659Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9045772Z return func(*args, **kwargs) 2022-11-23T03:12:18.9046123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T03:12:18.9046264Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9046630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9046722Z p_assert( 2022-11-23T03:12:18.9047081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9047192Z output = model(*input) 2022-11-23T03:12:18.9047515Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9047632Z traceback.print_stack() 2022-11-23T03:12:18.9048101Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9048231Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9048580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9048737Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9049247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9049360Z _lazy_init(state, module) 2022-11-23T03:12:18.9049695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9049825Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9050147Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9050268Z return func(*args, **kwargs) 2022-11-23T03:12:18.9050632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9050723Z p_assert( 2022-11-23T03:12:18.9051041Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.9051156Z traceback.print_stack() 2022-11-23T03:12:18.9051275Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.9051475Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9051608Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9051801Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9051941Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9052133Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9052228Z self.run() 2022-11-23T03:12:18.9052466Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9052611Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9052940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9053224Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9053560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9053668Z getattr(self, test_name)() 2022-11-23T03:12:18.9053994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9054078Z fn() 2022-11-23T03:12:18.9054416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9054569Z test(self, **param_kwargs) 2022-11-23T03:12:18.9054904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9055013Z return func(*args, **kwargs) 2022-11-23T03:12:18.9055247Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9055345Z self.run_subtests( 
2022-11-23T03:12:18.9055664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9055810Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9056147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9056281Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9056635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9056748Z output = model(*input) 2022-11-23T03:12:18.9057055Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9057180Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9057524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9057682Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9058022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9058129Z _lazy_init(state, module) 2022-11-23T03:12:18.9058455Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9058581Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9058897Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9059009Z return func(*args, **kwargs) 2022-11-23T03:12:18.9059354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9059446Z p_assert( 2022-11-23T03:12:18.9059758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.9059866Z traceback.print_stack() 2022-11-23T03:12:18.9060083Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9060300Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9060690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9060910Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9061026Z File "<string>", line 1, in <module> 2022-11-23T03:12:18.9061270Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9061407Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9061598Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9061737Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9061936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9062029Z self.run() 2022-11-23T03:12:18.9062215Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9062352Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9062679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9062801Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9063153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9063317Z getattr(self, test_name)() 2022-11-23T03:12:18.9063671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9063758Z fn() 2022-11-23T03:12:18.9064392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9064507Z test(self, **param_kwargs) 2022-11-23T03:12:18.9064855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9064966Z return func(*args, **kwargs) 2022-11-23T03:12:18.9065208Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9065311Z self.run_subtests( 2022-11-23T03:12:18.9065648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9065806Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9066147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9066288Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9066649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9066755Z output = model(*input) 2022-11-23T03:12:18.9067072Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9067201Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9067563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9067726Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9068412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9068527Z _lazy_init(state, module) 2022-11-23T03:12:18.9068867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9068998Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9069321Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9069434Z return func(*args, **kwargs) 2022-11-23T03:12:18.9069801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9069894Z p_assert( 2022-11-23T03:12:18.9070210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9070323Z traceback.print_stack() 2022-11-23T03:12:18.9070444Z File "", line 1, in 2022-11-23T03:12:18.9070716Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9070856Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9070971Z File "", line 1, in 2022-11-23T03:12:18.9071323Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9071462Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9071647Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9071773Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9071967Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9072058Z self.run() 2022-11-23T03:12:18.9072242Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9072379Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9072799Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9072928Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9073128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9073219Z self.run() 2022-11-23T03:12:18.9073551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9073675Z 
self.run_test(test_name, pipe) 2022-11-23T03:12:18.9073864Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9073994Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9074340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9074446Z getattr(self, test_name)() 2022-11-23T03:12:18.9074770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9074901Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9075250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9075336Z fn() 2022-11-23T03:12:18.9075680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9075792Z getattr(self, test_name)() 2022-11-23T03:12:18.9076146Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9076257Z test(self, **param_kwargs) 2022-11-23T03:12:18.9076606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9076690Z fn() 2022-11-23T03:12:18.9077035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9077152Z return func(*args, **kwargs) 2022-11-23T03:12:18.9077507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9077619Z test(self, **param_kwargs) 2022-11-23T03:12:18.9077851Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9077951Z self.run_subtests( 2022-11-23T03:12:18.9078292Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9078404Z return func(*args, **kwargs) 2022-11-23T03:12:18.9078743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9078893Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9079132Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9079282Z self.run_subtests( 2022-11-23T03:12:18.9079793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9079935Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9080259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9080575Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9080938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9081048Z output = model(*input) 2022-11-23T03:12:18.9081396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9081536Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9081952Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9082076Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9082440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9082547Z output = model(*input) 2022-11-23T03:12:18.9082912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T03:12:18.9083076Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9083391Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9083521Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9083872Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9083979Z _lazy_init(state, module) 2022-11-23T03:12:18.9084343Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9084507Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9084849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9084981Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9085659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9085772Z _lazy_init(state, module) 2022-11-23T03:12:18.9086096Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9086203Z return func(*args, **kwargs) 2022-11-23T03:12:18.9086536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9086670Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9087040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9087133Z p_assert( 2022-11-23T03:12:18.9087457Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9087569Z return func(*args, **kwargs) 2022-11-23T03:12:18.9087950Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9088057Z traceback.print_stack() 2022-11-23T03:12:18.9088419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9088509Z p_assert( 2022-11-23T03:12:18.9088829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9088945Z traceback.print_stack() 2022-11-23T03:12:18.9089111Z File "", line 1, in 2022-11-23T03:12:18.9089322Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9089444Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9089634Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9089933Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9090128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9090217Z self.run() 2022-11-23T03:12:18.9090399Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9090709Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9091036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9091151Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9091552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9091663Z getattr(self, test_name)() 2022-11-23T03:12:18.9092007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9092091Z fn() 2022-11-23T03:12:18.9092442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9092552Z test(self, **param_kwargs) 2022-11-23T03:12:18.9092896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9093003Z return func(*args, **kwargs) 2022-11-23T03:12:18.9093243Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9093345Z self.run_subtests( 2022-11-23T03:12:18.9093689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9093840Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9094189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9094329Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9094691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9094794Z output = model(*input) 2022-11-23T03:12:18.9095105Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9095391Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9095741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9095906Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9096250Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9096355Z _lazy_init(state, module) 2022-11-23T03:12:18.9096681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9096800Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9097117Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9097225Z return func(*args, **kwargs) 2022-11-23T03:12:18.9097579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9097666Z p_assert( 2022-11-23T03:12:18.9097977Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9098091Z traceback.print_stack() 2022-11-23T03:12:18.9098539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9098765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9098986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9099208Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9099328Z File "", line 1, in 2022-11-23T03:12:18.9099523Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9099654Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9099843Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9099980Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9100177Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9100319Z self.run() 2022-11-23T03:12:18.9100509Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9100641Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9100973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9101095Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9101442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9101546Z getattr(self, test_name)() 2022-11-23T03:12:18.9102052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9102134Z fn() 2022-11-23T03:12:18.9102650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9102766Z test(self, **param_kwargs) 2022-11-23T03:12:18.9103113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9103230Z return func(*args, **kwargs) 2022-11-23T03:12:18.9103472Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9103568Z self.run_subtests( 2022-11-23T03:12:18.9104120Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9104284Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9104642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9104783Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9105145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9105261Z output = model(*input) 2022-11-23T03:12:18.9105575Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9105698Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9106061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9106225Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9106577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9106689Z _lazy_init(state, module) 2022-11-23T03:12:18.9107025Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9107155Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9107483Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9107657Z return func(*args, **kwargs) 2022-11-23T03:12:18.9108198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9108286Z p_assert( 2022-11-23T03:12:18.9108599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9108712Z traceback.print_stack() 2022-11-23T03:12:18.9108825Z File "", line 1, in 2022-11-23T03:12:18.9109016Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9109144Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9109320Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9109454Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9109711Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9109808Z self.run() 2022-11-23T03:12:18.9109994Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9110124Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9110438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9110554Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9110883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9110991Z getattr(self, test_name)() 2022-11-23T03:12:18.9111323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9111407Z fn() 2022-11-23T03:12:18.9111520Z File "", line 1, in 2022-11-23T03:12:18.9111858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9111972Z test(self, **param_kwargs) 2022-11-23T03:12:18.9112301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9112411Z return func(*args, **kwargs) 2022-11-23T03:12:18.9112599Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9112728Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:18.9112961Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9113058Z self.run_subtests( 2022-11-23T03:12:18.9113241Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9113376Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9113700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9113855Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9114051Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9114143Z self.run() 2022-11-23T03:12:18.9114482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9114617Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9114802Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9114934Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9115279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9115383Z output = model(*input) 2022-11-23T03:12:18.9115694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9115814Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9116161Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9116293Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9116630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9116740Z getattr(self, test_name)() 2022-11-23T03:12:18.9117083Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9117421Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9117771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9117857Z fn() 2022-11-23T03:12:18.9118211Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9118375Z _lazy_init(state, module) 2022-11-23T03:12:18.9118730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9118841Z test(self, **param_kwargs) 2022-11-23T03:12:18.9119171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9119301Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9119648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9119761Z return func(*args, **kwargs) 2022-11-23T03:12:18.9120085Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9120357Z return func(*args, **kwargs) 2022-11-23T03:12:18.9120591Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9120697Z self.run_subtests( 2022-11-23T03:12:18.9121047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9121134Z p_assert( 2022-11-23T03:12:18.9121460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9121606Z test_fn(*test_args, **test_kwargs, 
**subtest_kwargs) 2022-11-23T03:12:18.9122108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9122222Z traceback.print_stack() 2022-11-23T03:12:18.9122571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9122711Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9123066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9123181Z output = model(*input) 2022-11-23T03:12:18.9123495Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9123627Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9123993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9124157Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9124508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9124618Z _lazy_init(state, module) 2022-11-23T03:12:18.9124729Z File "", line 1, in 2022-11-23T03:12:18.9125228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9125528Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9125903Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9126021Z return func(*args, **kwargs) 2022-11-23T03:12:18.9126219Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9126349Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9126714Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9126799Z p_assert( 2022-11-23T03:12:18.9126990Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9127126Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9127447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9127560Z traceback.print_stack() 2022-11-23T03:12:18.9127811Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9127903Z self.run() 2022-11-23T03:12:18.9128087Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9128223Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9128543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9128663Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9129007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9129119Z getattr(self, test_name)() 2022-11-23T03:12:18.9129462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9129547Z fn() 2022-11-23T03:12:18.9129891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9130009Z test(self, **param_kwargs) 2022-11-23T03:12:18.9130396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9130511Z return func(*args, **kwargs) 2022-11-23T03:12:18.9130752Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 
2022-11-23T03:12:18.9130853Z self.run_subtests( 2022-11-23T03:12:18.9131192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9131342Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9131689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9131829Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9132201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9132318Z output = model(*input) 2022-11-23T03:12:18.9132631Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9132917Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9133269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9133431Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9133765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9133872Z _lazy_init(state, module) 2022-11-23T03:12:18.9134198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9134325Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9134689Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9134802Z return func(*args, **kwargs) 2022-11-23T03:12:18.9135158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9135249Z p_assert( 2022-11-23T03:12:18.9135558Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9135669Z traceback.print_stack() 2022-11-23T03:12:18.9135892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9136107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9136320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9136533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9136700Z File "", line 1, in 2022-11-23T03:12:18.9136893Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9137014Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9137198Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9137331Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9137698Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9137793Z self.run() 2022-11-23T03:12:18.9137985Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9138118Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9138439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9138559Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9138915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9139029Z getattr(self, test_name)() 2022-11-23T03:12:18.9139378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9139465Z fn() 2022-11-23T03:12:18.9139819Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9139931Z test(self, **param_kwargs) 2022-11-23T03:12:18.9140267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9140381Z return func(*args, **kwargs) 2022-11-23T03:12:18.9140623Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9140723Z self.run_subtests( 2022-11-23T03:12:18.9141069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9141219Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9141569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9141710Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9142064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9142172Z output = model(*input) 2022-11-23T03:12:18.9142483Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9142612Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9142974Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9143139Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9143532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9143647Z _lazy_init(state, module) 2022-11-23T03:12:18.9144199Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.9144340Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9144672Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9144783Z return func(*args, **kwargs) 2022-11-23T03:12:18.9145149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9145239Z p_assert( 2022-11-23T03:12:18.9145563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9145768Z traceback.print_stack() 2022-11-23T03:12:18.9145882Z File "", line 1, in 2022-11-23T03:12:18.9146081Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9146213Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9146402Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9146539Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9146739Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9146832Z self.run() 2022-11-23T03:12:18.9147019Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9147146Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9147641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9147763Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9148104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9148213Z getattr(self, test_name)() 2022-11-23T03:12:18.9148547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9148631Z fn() 2022-11-23T03:12:18.9148962Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9149069Z test(self, **param_kwargs) 2022-11-23T03:12:18.9149399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9149508Z return func(*args, **kwargs) 2022-11-23T03:12:18.9149920Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9150026Z self.run_subtests( 2022-11-23T03:12:18.9150372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9150522Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9150868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9151009Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9151372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9151484Z output = model(*input) 2022-11-23T03:12:18.9151603Z File "", line 1, in 2022-11-23T03:12:18.9151915Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9152044Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9152407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9152631Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9152839Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9152968Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9153324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T03:12:18.9153433Z _lazy_init(state, module) 2022-11-23T03:12:18.9153620Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9153759Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9154095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9154372Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9154568Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9154885Z self.run() 2022-11-23T03:12:18.9155218Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9155330Z return func(*args, **kwargs) 2022-11-23T03:12:18.9155521Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9155654Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9156018Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9156103Z p_assert( 2022-11-23T03:12:18.9156426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9156546Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9156868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9156984Z traceback.print_stack() 2022-11-23T03:12:18.9157335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9157448Z getattr(self, test_name)() 2022-11-23T03:12:18.9157952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9158031Z fn() 2022-11-23T03:12:18.9158368Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9158476Z test(self, **param_kwargs) 2022-11-23T03:12:18.9158805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9158914Z return func(*args, **kwargs) 2022-11-23T03:12:18.9159147Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9159249Z self.run_subtests( 2022-11-23T03:12:18.9159581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9159723Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9160067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9160202Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9160549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9160654Z output = model(*input) 2022-11-23T03:12:18.9160940Z File "", line 1, in 2022-11-23T03:12:18.9161253Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9161381Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9161738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9161952Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9162159Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9162290Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9162642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T03:12:18.9162752Z _lazy_init(state, module) 2022-11-23T03:12:18.9162942Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9163074Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9163411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9163541Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9163740Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9164036Z self.run() 2022-11-23T03:12:18.9164352Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9164460Z return func(*args, **kwargs) 2022-11-23T03:12:18.9164645Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9164768Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9165301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9165393Z p_assert( 2022-11-23T03:12:18.9165717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9165839Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9166163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9166281Z traceback.print_stack() 2022-11-23T03:12:18.9167368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9167476Z getattr(self, test_name)() 2022-11-23T03:12:18.9167823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9167908Z fn() 2022-11-23T03:12:18.9168257Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9168368Z test(self, **param_kwargs) 2022-11-23T03:12:18.9168708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9168821Z return func(*args, **kwargs) 2022-11-23T03:12:18.9169060Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9169159Z self.run_subtests( 2022-11-23T03:12:18.9169501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9169652Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9170001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9170143Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9170505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9170610Z output = model(*input) 2022-11-23T03:12:18.9170922Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9171043Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9171407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9171623Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9172140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9172245Z _lazy_init(state, module) 2022-11-23T03:12:18.9172569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.9172694Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9173006Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9173108Z return func(*args, **kwargs) 2022-11-23T03:12:18.9173457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9173545Z p_assert( 2022-11-23T03:12:18.9173861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9174024Z traceback.print_stack() 2022-11-23T03:12:18.9174241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9174455Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9174670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9174877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9175084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9175468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9175684Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9175902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9176122Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9176336Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9176548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9176755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9176965Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9177179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9177393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9177605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9177816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9178034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9178249Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9178460Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9178665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9178876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9179245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9179449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9179652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9179860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9180107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9180319Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9181043Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9181925Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9182719Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9183452Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9184369Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9185109Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9185831Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9186553Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9187272Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9188036Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9188754Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9189675Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9190380Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9191256Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9192033Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9192749Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9192975Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9193197Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9193415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9193640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9193862Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9194078Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9194290Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9194497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9194707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9194921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9195134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9195349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9195725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9195930Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9196140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9196344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9196541Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9196746Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9196951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9197157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9197360Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9197614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9197825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9198030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9198227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9198430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9198813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9205213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9205655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9205884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9206211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9206436Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9206647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9206869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9207084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9207301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9207513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9207727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9207946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9208163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9208538Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9208738Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9208942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9209148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9209351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9209557Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9209761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9209974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9210728Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9211427Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:18.9212185Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9212903Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9213600Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9214295Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9215059Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9215750Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9216446Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9217142Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9218019Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9218735Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9219456Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9220173Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9220882Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9221639Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9221869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9222092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9222310Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9222532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9222747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9223010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9223227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9223441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9223651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9224145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9224382Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9224599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9224816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9225028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9225252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9225466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9225680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9225886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9226102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9226315Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9226416Z dist init r=1, world=4 2022-11-23T03:12:18.9226738Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9227045Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9227344Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:18.9227636Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9227924Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9228206Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9228494Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9229050Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9229341Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9229618Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9229896Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9230373Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9230567Z dist init r=2, world=4 2022-11-23T03:12:18.9230887Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.9231190Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9231485Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9231777Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9232060Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9232354Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9232642Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9232934Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9233381Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9233660Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9233942Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.9234223Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9234320Z dist init r=3, world=4 2022-11-23T03:12:18.9234618Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9234910Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9235188Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9235512Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9235809Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9236089Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9236367Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9236646Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9236924Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.9237257Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9237545Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9237998Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9238100Z dist init r=0, world=4 2022-11-23T03:12:18.9238384Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9238674Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9238972Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9239261Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9239547Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9239837Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9240123Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:18.9240420Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9240706Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9241154Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9241432Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9241709Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9241789Z ok (6.623s) 2022-11-23T03:12:18.9242326Z test_nested_wrapped_model_offload_true_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25570 2022-11-23T03:12:18.9242548Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25571 2022-11-23T03:12:18.9242751Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25572 2022-11-23T03:12:18.9242954Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25573 2022-11-23T03:12:18.9243334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.9243498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.9243868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.9244048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.9244398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.9244613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.9244983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.9245322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.9245664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.9245820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.9246168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.9246339Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.9246673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:18.9246836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.9247188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.9247360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.9247584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.9247810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.9248032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.9248250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.9248622Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.9248993Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.9249356Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.9249716Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.9250104Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.9250320Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.9250536Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.9250748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.9250974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9251247Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9251466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9251688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9252701Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.9252802Z warnings.warn( 2022-11-23T03:12:18.9253959Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.9254103Z warnings.warn( 2022-11-23T03:12:18.9255071Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.9255167Z warnings.warn( 2022-11-23T03:12:18.9256133Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.9256228Z warnings.warn( 2022-11-23T03:12:18.9256341Z File "", line 1, in 2022-11-23T03:12:18.9256536Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9256660Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9256839Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9256972Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9257348Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9257446Z self.run() 2022-11-23T03:12:18.9257639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9257775Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9258106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9258229Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9258574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9258686Z getattr(self, test_name)() 2022-11-23T03:12:18.9259032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9259119Z fn() 2022-11-23T03:12:18.9259474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9259588Z test(self, **param_kwargs) 2022-11-23T03:12:18.9259977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9260251Z return func(*args, **kwargs) 2022-11-23T03:12:18.9260487Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9260586Z self.run_subtests( 2022-11-23T03:12:18.9260916Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9261065Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9261592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9261732Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9262094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9262248Z output = model(*input) 2022-11-23T03:12:18.9262565Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9262697Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9263062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9263227Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9263578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9263688Z _lazy_init(state, module) 2022-11-23T03:12:18.9264398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9264537Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9264853Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9264973Z return func(*args, **kwargs) 2022-11-23T03:12:18.9265329Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9265689Z p_assert( 2022-11-23T03:12:18.9266020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9266134Z traceback.print_stack() 2022-11-23T03:12:18.9266252Z File "", line 1, in 2022-11-23T03:12:18.9266445Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9266575Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9266763Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9266902Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9267103Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9267201Z self.run() 2022-11-23T03:12:18.9267397Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9267530Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9267849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9267971Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9268315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9268427Z getattr(self, test_name)() 2022-11-23T03:12:18.9268774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9268859Z fn() 2022-11-23T03:12:18.9269209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9269326Z test(self, **param_kwargs) 2022-11-23T03:12:18.9269738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9269863Z return func(*args, **kwargs) 2022-11-23T03:12:18.9270107Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9270209Z self.run_subtests( 2022-11-23T03:12:18.9270550Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9270699Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9271050Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9271190Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9271710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9271892Z output = model(*input) 2022-11-23T03:12:18.9272198Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9272323Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9272862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9273029Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9273384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9273493Z _lazy_init(state, module) 2022-11-23T03:12:18.9273826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9273958Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9274282Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9274400Z return func(*args, **kwargs) 2022-11-23T03:12:18.9274765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9274856Z p_assert( 2022-11-23T03:12:18.9275178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9275452Z traceback.print_stack() 2022-11-23T03:12:18.9275561Z File "", line 1, in 2022-11-23T03:12:18.9275752Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9275879Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9276062Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9276195Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9276388Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9276486Z self.run() 2022-11-23T03:12:18.9276666Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9276795Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9277112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9277230Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9277566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9277674Z getattr(self, test_name)() 2022-11-23T03:12:18.9278005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9278089Z fn() 2022-11-23T03:12:18.9278422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9278534Z test(self, **param_kwargs) 2022-11-23T03:12:18.9279097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9279216Z return func(*args, **kwargs) 2022-11-23T03:12:18.9279460Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9279563Z self.run_subtests( 2022-11-23T03:12:18.9279903Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9280052Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9280397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9280537Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9280898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9281053Z output = model(*input) 2022-11-23T03:12:18.9281369Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9281661Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9282014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9282357Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9282704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9282815Z _lazy_init(state, module) 2022-11-23T03:12:18.9283152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9283282Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9283611Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9283724Z return func(*args, **kwargs) 2022-11-23T03:12:18.9284091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9284182Z p_assert( 2022-11-23T03:12:18.9284499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9284612Z traceback.print_stack() 2022-11-23T03:12:18.9284729Z File "", line 1, in 2022-11-23T03:12:18.9284927Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9285056Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9285249Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9285385Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9285591Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9285840Z self.run() 2022-11-23T03:12:18.9286200Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9286335Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9286660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9286780Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9287124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9287236Z getattr(self, test_name)() 2022-11-23T03:12:18.9287574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9287663Z fn() 2022-11-23T03:12:18.9288078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9288192Z test(self, **param_kwargs) 2022-11-23T03:12:18.9288583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9288701Z return func(*args, **kwargs) 2022-11-23T03:12:18.9288941Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9289042Z self.run_subtests( 2022-11-23T03:12:18.9289527Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9289673Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9290008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9290150Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9290500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9290653Z output = model(*input) 2022-11-23T03:12:18.9290959Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9291087Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9291622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9291786Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9292139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9292246Z _lazy_init(state, module) 2022-11-23T03:12:18.9292584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9292716Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9293047Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9293160Z return func(*args, **kwargs) 2022-11-23T03:12:18.9293520Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9293611Z p_assert( 2022-11-23T03:12:18.9293933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9294048Z traceback.print_stack() 2022-11-23T03:12:18.9294273Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9294496Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9294718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9294937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9295053Z File "", line 1, in 2022-11-23T03:12:18.9295255Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9295386Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9295733Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9295867Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9296061Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9296150Z self.run() 2022-11-23T03:12:18.9296334Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9296461Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9296780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9296899Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9297287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9297400Z getattr(self, test_name)() 2022-11-23T03:12:18.9297736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9297823Z fn() 2022-11-23T03:12:18.9298162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9298265Z test(self, **param_kwargs) 2022-11-23T03:12:18.9298594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9298705Z return func(*args, **kwargs) 2022-11-23T03:12:18.9299113Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9299215Z self.run_subtests( 2022-11-23T03:12:18.9299605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9299759Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9300110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9300247Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9300613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9300719Z output = model(*input) 2022-11-23T03:12:18.9301033Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9301162Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9301523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9301692Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9302047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9302153Z _lazy_init(state, module) 2022-11-23T03:12:18.9302489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9302781Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9303094Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9303203Z return func(*args, **kwargs) 2022-11-23T03:12:18.9303735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9303828Z p_assert( 2022-11-23T03:12:18.9304391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9304506Z traceback.print_stack() 2022-11-23T03:12:18.9304630Z File "", line 1, in 2022-11-23T03:12:18.9304830Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9304962Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9305157Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9305295Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9305495Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9305580Z self.run() 2022-11-23T03:12:18.9305769Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9305902Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9306229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9306350Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9306775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9306896Z getattr(self, test_name)() 2022-11-23T03:12:18.9307246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9307326Z fn() 2022-11-23T03:12:18.9307680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9307790Z test(self, **param_kwargs) 2022-11-23T03:12:18.9308133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9308244Z return func(*args, **kwargs) 2022-11-23T03:12:18.9308484Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9308586Z self.run_subtests( 2022-11-23T03:12:18.9308997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9309143Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9309494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9309637Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9309754Z File "", line 1, in 2022-11-23T03:12:18.9310118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9310226Z output = model(*input) 2022-11-23T03:12:18.9310537Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9310668Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9310864Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9310996Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9311363Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9311528Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9311716Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9311855Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9312210Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9312321Z _lazy_init(state, module) 2022-11-23T03:12:18.9312512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9312604Z self.run() 2022-11-23T03:12:18.9312943Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9313076Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9313271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9313404Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9313729Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9313836Z return func(*args, **kwargs) 2022-11-23T03:12:18.9314201Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9314291Z p_assert( 2022-11-23T03:12:18.9314612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9314724Z traceback.print_stack() 2022-11-23T03:12:18.9315046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9315166Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9315566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9315679Z getattr(self, test_name)() 2022-11-23T03:12:18.9316025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9316273Z fn() 2022-11-23T03:12:18.9316614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9316719Z test(self, **param_kwargs) 2022-11-23T03:12:18.9317046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9317155Z return func(*args, **kwargs) 2022-11-23T03:12:18.9317387Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9317480Z self.run_subtests( 2022-11-23T03:12:18.9317853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9318174Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9318525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9318666Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9319027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9319133Z output = model(*input) 2022-11-23T03:12:18.9319443Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9319566Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9319929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9320098Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9320454Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9320564Z _lazy_init(state, module) 2022-11-23T03:12:18.9320901Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9321031Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9321513Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9321617Z return func(*args, **kwargs) 2022-11-23T03:12:18.9321972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9322061Z p_assert( 2022-11-23T03:12:18.9322552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9322670Z traceback.print_stack() 2022-11-23T03:12:18.9322790Z File "", line 1, in 2022-11-23T03:12:18.9322988Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9323119Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9323302Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9323446Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9323645Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9323737Z self.run() 2022-11-23T03:12:18.9323926Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9324059Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9324387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9324505Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9324905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9325023Z getattr(self, test_name)() 2022-11-23T03:12:18.9325370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9325456Z fn() 2022-11-23T03:12:18.9325972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9326251Z test(self, **param_kwargs) 2022-11-23T03:12:18.9326596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9326704Z return func(*args, **kwargs) 2022-11-23T03:12:18.9326944Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9327097Z self.run_subtests( 2022-11-23T03:12:18.9327440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9327592Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9327940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9328080Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9328440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9328541Z output = model(*input) 2022-11-23T03:12:18.9328855Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9329145Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9329495Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9329662Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9330003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9330108Z _lazy_init(state, module) 2022-11-23T03:12:18.9330648Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9330776Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9331105Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9331217Z return func(*args, **kwargs) 2022-11-23T03:12:18.9331581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9331670Z p_assert( 2022-11-23T03:12:18.9331990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9332109Z traceback.print_stack() 2022-11-23T03:12:18.9332338Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9332558Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9332780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9333000Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9333118Z File "", line 1, in 2022-11-23T03:12:18.9333315Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9333604Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9333788Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9333923Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9334114Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9334252Z self.run() 2022-11-23T03:12:18.9334445Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9334575Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9334889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9335006Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9335340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9335447Z getattr(self, test_name)() 2022-11-23T03:12:18.9335555Z File "", line 1, in 2022-11-23T03:12:18.9335890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9335973Z fn() 2022-11-23T03:12:18.9336162Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9336352Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9336695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9336803Z test(self, **param_kwargs) 2022-11-23T03:12:18.9336979Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9337115Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9337449Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9337558Z return func(*args, **kwargs) 2022-11-23T03:12:18.9337753Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9337843Z self.run() 2022-11-23T03:12:18.9338076Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9338354Z self.run_subtests( 2022-11-23T03:12:18.9338543Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9338679Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9338800Z File "", line 1, in 2022-11-23T03:12:18.9339147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9339297Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9339617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9339738Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9339935Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9340058Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9340410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9340558Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9340906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9341016Z getattr(self, test_name)() 2022-11-23T03:12:18.9341205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9341504Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9341854Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9341952Z output = model(*input) 2022-11-23T03:12:18.9342461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9342551Z fn() 2022-11-23T03:12:18.9342751Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9342893Z self.run() 2022-11-23T03:12:18.9343283Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9343417Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9343762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9344086Z test(self, **param_kwargs) 2022-11-23T03:12:18.9344465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9344629Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9344821Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9344954Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9345294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9345487Z return func(*args, **kwargs) 2022-11-23T03:12:18.9345841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9345951Z _lazy_init(state, module) 2022-11-23T03:12:18.9346191Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9346294Z self.run_subtests( 2022-11-23T03:12:18.9346629Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9346759Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9347083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9347203Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9347535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9347693Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9348023Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9348297Z return func(*args, **kwargs) 2022-11-23T03:12:18.9348633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9348741Z getattr(self, test_name)() 2022-11-23T03:12:18.9349077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9349213Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9349738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9349830Z p_assert( 2022-11-23T03:12:18.9350177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9350270Z fn() 2022-11-23T03:12:18.9350634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9350741Z output = model(*input) 2022-11-23T03:12:18.9351067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9351182Z traceback.print_stack() 2022-11-23T03:12:18.9351526Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9351636Z test(self, **param_kwargs) 2022-11-23T03:12:18.9351946Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9352074Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9352416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9352534Z return func(*args, **kwargs) 2022-11-23T03:12:18.9352953Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9353128Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9353363Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9353465Z self.run_subtests( 2022-11-23T03:12:18.9353819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9353927Z _lazy_init(state, module) 2022-11-23T03:12:18.9354415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9354560Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9354940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9355066Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9355395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9355531Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9355843Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.9355953Z return func(*args, **kwargs) 2022-11-23T03:12:18.9356300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9356404Z output = model(*input) 2022-11-23T03:12:18.9356756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9356847Z p_assert( 2022-11-23T03:12:18.9357147Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9357275Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9357589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9357699Z traceback.print_stack() 2022-11-23T03:12:18.9358050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9358208Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9358549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9358653Z _lazy_init(state, module) 2022-11-23T03:12:18.9358972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9359102Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9359422Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9359535Z return func(*args, **kwargs) 2022-11-23T03:12:18.9359890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9359978Z p_assert( 2022-11-23T03:12:18.9360292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in 
p_assert 2022-11-23T03:12:18.9360403Z traceback.print_stack() 2022-11-23T03:12:18.9360511Z File "", line 1, in 2022-11-23T03:12:18.9360701Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9360826Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9361012Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9361145Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9361405Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9361680Z self.run() 2022-11-23T03:12:18.9361867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9362002Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9362329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9362450Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9362795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9362909Z getattr(self, test_name)() 2022-11-23T03:12:18.9363250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9363337Z fn() 2022-11-23T03:12:18.9363680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9363875Z test(self, **param_kwargs) 2022-11-23T03:12:18.9364223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9364336Z return func(*args, **kwargs) 2022-11-23T03:12:18.9364724Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9364823Z self.run_subtests( 2022-11-23T03:12:18.9365151Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9365297Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9365631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9365945Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9366316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9366423Z output = model(*input) 2022-11-23T03:12:18.9366737Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9366866Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9367227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9367391Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9367738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9367847Z _lazy_init(state, module) 2022-11-23T03:12:18.9368183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9368318Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9368645Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9368758Z return func(*args, **kwargs) 2022-11-23T03:12:18.9369123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9369215Z p_assert( 2022-11-23T03:12:18.9369532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9369647Z traceback.print_stack() 2022-11-23T03:12:18.9369871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9370095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9370316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9370536Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9370705Z File "", line 1, in 2022-11-23T03:12:18.9370912Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9371035Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9371227Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9371367Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9371569Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9371667Z self.run() 2022-11-23T03:12:18.9371784Z File "", line 1, in 2022-11-23T03:12:18.9372132Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9372262Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9372575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9372739Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9372933Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9373057Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9373394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9373501Z getattr(self, test_name)() 2022-11-23T03:12:18.9373681Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:12:18.9373807Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9374143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9374225Z fn() 2022-11-23T03:12:18.9374418Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9374504Z self.run() 2022-11-23T03:12:18.9374849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9374957Z test(self, **param_kwargs) 2022-11-23T03:12:18.9375140Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9375263Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9375595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9375704Z return func(*args, **kwargs) 2022-11-23T03:12:18.9376016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9376130Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9376361Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9376459Z self.run_subtests( 2022-11-23T03:12:18.9376974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9377080Z getattr(self, test_name)() 2022-11-23T03:12:18.9377420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9377569Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9377915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9378001Z fn() 
2022-11-23T03:12:18.9378352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9378491Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9378833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9378937Z test(self, **param_kwargs) 2022-11-23T03:12:18.9379346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9379463Z output = model(*input) 2022-11-23T03:12:18.9379969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9380078Z return func(*args, **kwargs) 2022-11-23T03:12:18.9380379Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9380504Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9380906Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9381004Z self.run_subtests( 2022-11-23T03:12:18.9381368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9381531Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9381924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9382076Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9382426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9382535Z _lazy_init(state, module) 2022-11-23T03:12:18.9382884Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9383017Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9383353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9383485Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9384040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9384181Z output = model(*input) 2022-11-23T03:12:18.9384516Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9384629Z return func(*args, **kwargs) 2022-11-23T03:12:18.9384941Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9385064Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9385428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9385520Z p_assert( 2022-11-23T03:12:18.9385880Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9386043Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9386365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9386486Z traceback.print_stack() 2022-11-23T03:12:18.9386839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9386942Z _lazy_init(state, module) 2022-11-23T03:12:18.9387279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9387411Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:18.9387732Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9387890Z return func(*args, **kwargs) 2022-11-23T03:12:18.9388256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9388347Z p_assert( 2022-11-23T03:12:18.9388666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9388847Z traceback.print_stack() 2022-11-23T03:12:18.9388973Z File "", line 1, in 2022-11-23T03:12:18.9389169Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9389299Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9389489Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9389628Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9389826Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9390067Z self.run() 2022-11-23T03:12:18.9390255Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9390386Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9390704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9390884Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9391222Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9391330Z getattr(self, test_name)() 2022-11-23T03:12:18.9391840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9391920Z fn() 2022-11-23T03:12:18.9392272Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9392382Z test(self, **param_kwargs) 2022-11-23T03:12:18.9392721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9392834Z return func(*args, **kwargs) 2022-11-23T03:12:18.9393074Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9393181Z self.run_subtests( 2022-11-23T03:12:18.9393519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9393664Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9394012Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9394150Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9394510Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9394617Z output = model(*input) 2022-11-23T03:12:18.9394929Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9395058Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9395419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9395584Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9396095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9396201Z _lazy_init(state, module) 2022-11-23T03:12:18.9396526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.9396652Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9396962Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9397071Z return func(*args, **kwargs) 2022-11-23T03:12:18.9397420Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9397503Z p_assert( 2022-11-23T03:12:18.9397814Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9397973Z traceback.print_stack() 2022-11-23T03:12:18.9398092Z File "", line 1, in 2022-11-23T03:12:18.9398286Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9398411Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9398593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9398726Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9398912Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9399001Z self.run() 2022-11-23T03:12:18.9399184Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9399489Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9399816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9400007Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9400360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9400465Z getattr(self, test_name)() 2022-11-23T03:12:18.9400813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9400898Z fn() 2022-11-23T03:12:18.9401251Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9401362Z test(self, **param_kwargs) 2022-11-23T03:12:18.9401701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9401814Z return func(*args, **kwargs) 2022-11-23T03:12:18.9402056Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9402156Z self.run_subtests( 2022-11-23T03:12:18.9402494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9402646Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9402996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9403135Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9403495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9403601Z output = model(*input) 2022-11-23T03:12:18.9403914Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9404036Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9404397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9404567Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9404920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9405029Z _lazy_init(state, module) 2022-11-23T03:12:18.9405527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.9405656Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9405967Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9406069Z return func(*args, **kwargs) 2022-11-23T03:12:18.9406607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9406697Z p_assert( 2022-11-23T03:12:18.9407024Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9407184Z traceback.print_stack() 2022-11-23T03:12:18.9407416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9407639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9407860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9408074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9408191Z File "", line 1, in 2022-11-23T03:12:18.9408388Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9408517Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9408705Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9408844Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9409254Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9409348Z self.run() 2022-11-23T03:12:18.9409526Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9409657Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9409972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9410089Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9410424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9410533Z getattr(self, test_name)() 2022-11-23T03:12:18.9410646Z File "", line 1, in 2022-11-23T03:12:18.9410978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9411060Z fn() 2022-11-23T03:12:18.9411409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9411516Z test(self, **param_kwargs) 2022-11-23T03:12:18.9411704Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9411827Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9412160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9412269Z return func(*args, **kwargs) 2022-11-23T03:12:18.9412444Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9412579Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9412810Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9412907Z self.run_subtests( 2022-11-23T03:12:18.9413104Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9413196Z self.run() 2022-11-23T03:12:18.9413525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9413671Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9413778Z File "", line 1, in 2022-11-23T03:12:18.9413959Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9414088Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9414424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9414559Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9414749Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9414873Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9415241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9415361Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9415711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9415813Z output = model(*input) 2022-11-23T03:12:18.9415996Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9416131Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9416430Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9416554Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9416887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9416989Z getattr(self, test_name)() 2022-11-23T03:12:18.9417231Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9417323Z self.run() 2022-11-23T03:12:18.9417674Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9417830Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9418162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9418245Z fn() 2022-11-23T03:12:18.9418600Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9418738Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9419092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9419201Z _lazy_init(state, module) 2022-11-23T03:12:18.9419551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9419670Z test(self, **param_kwargs) 2022-11-23T03:12:18.9419992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9420112Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9420441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9420572Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9420915Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9421027Z return func(*args, **kwargs) 2022-11-23T03:12:18.9421529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9421638Z getattr(self, test_name)() 2022-11-23T03:12:18.9421948Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9422063Z return func(*args, **kwargs) 2022-11-23T03:12:18.9422289Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9422386Z self.run_subtests( 2022-11-23T03:12:18.9422717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9422800Z fn() 2022-11-23T03:12:18.9423342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9423432Z p_assert( 2022-11-23T03:12:18.9423768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9424138Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9424505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9424693Z test(self, **param_kwargs) 2022-11-23T03:12:18.9425027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9425141Z traceback.print_stack() 2022-11-23T03:12:18.9425490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9425629Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9425967Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9426081Z return func(*args, **kwargs) 2022-11-23T03:12:18.9426433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9426542Z output = model(*input) 2022-11-23T03:12:18.9426846Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9426952Z self.run_subtests( 2022-11-23T03:12:18.9427269Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9427397Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9427735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9427883Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9428236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9428399Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9428749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9428862Z _lazy_init(state, module) 2022-11-23T03:12:18.9429214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9429355Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9429691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9429821Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9430177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T03:12:18.9430287Z output = model(*input) 2022-11-23T03:12:18.9430651Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9430772Z return func(*args, **kwargs) 2022-11-23T03:12:18.9431084Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9431217Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9431586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9431678Z p_assert( 2022-11-23T03:12:18.9432034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9432198Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9432518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9432632Z traceback.print_stack() 2022-11-23T03:12:18.9432984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9433094Z _lazy_init(state, module) 2022-11-23T03:12:18.9433431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9433565Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9433931Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9434050Z return func(*args, **kwargs) 2022-11-23T03:12:18.9434417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9434508Z p_assert( 2022-11-23T03:12:18.9434832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.9434945Z traceback.print_stack() 2022-11-23T03:12:18.9435061Z File "", line 1, in 2022-11-23T03:12:18.9435258Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9435382Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9435572Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9435912Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9436109Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9436198Z self.run() 2022-11-23T03:12:18.9436381Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9436509Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9436826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9436938Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9437275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9437386Z getattr(self, test_name)() 2022-11-23T03:12:18.9437718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9437801Z fn() 2022-11-23T03:12:18.9438147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9438254Z test(self, **param_kwargs) 2022-11-23T03:12:18.9438578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9438862Z return func(*args, **kwargs) 2022-11-23T03:12:18.9439102Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9439202Z self.run_subtests( 
2022-11-23T03:12:18.9439537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9439687Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9440036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9440178Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9440536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9440645Z output = model(*input) 2022-11-23T03:12:18.9440961Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9441089Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9441449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9441612Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9442290Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9442400Z _lazy_init(state, module) 2022-11-23T03:12:18.9442737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9442866Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9443246Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9443364Z return func(*args, **kwargs) 2022-11-23T03:12:18.9443732Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9443822Z p_assert( 2022-11-23T03:12:18.9444143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:18.9444256Z traceback.print_stack() 2022-11-23T03:12:18.9444475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9444698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9444921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9445191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9445309Z File "", line 1, in 2022-11-23T03:12:18.9445505Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9445635Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9445986Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9446115Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9446306Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9446395Z self.run() 2022-11-23T03:12:18.9446578Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9446707Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9447028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9447149Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9447488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9447591Z getattr(self, test_name)() 2022-11-23T03:12:18.9447925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9448012Z fn() 2022-11-23T03:12:18.9448350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9448458Z test(self, **param_kwargs) 2022-11-23T03:12:18.9448789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9448898Z return func(*args, **kwargs) 2022-11-23T03:12:18.9449127Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9449224Z self.run_subtests( 2022-11-23T03:12:18.9449553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9449700Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9450035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9450170Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9450517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9450791Z output = model(*input) 2022-11-23T03:12:18.9451109Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9451232Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9451595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9451807Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9452170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9452280Z _lazy_init(state, module) 2022-11-23T03:12:18.9452614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9452745Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9453068Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9453175Z return func(*args, **kwargs) 2022-11-23T03:12:18.9453540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9453631Z p_assert( 2022-11-23T03:12:18.9454004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9454122Z traceback.print_stack() 2022-11-23T03:12:18.9454242Z File "", line 1, in 2022-11-23T03:12:18.9454597Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9454723Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9454899Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9455211Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9455414Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9455505Z self.run() 2022-11-23T03:12:18.9455693Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9455827Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9456152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9456275Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9456624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9456735Z getattr(self, test_name)() 2022-11-23T03:12:18.9457076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9457162Z fn() 2022-11-23T03:12:18.9457512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9457623Z test(self, **param_kwargs) 2022-11-23T03:12:18.9458130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9458235Z return func(*args, **kwargs) 2022-11-23T03:12:18.9458467Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9458569Z self.run_subtests( 2022-11-23T03:12:18.9458899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9459045Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9459382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9459518Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9459863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9459959Z output = model(*input) 2022-11-23T03:12:18.9460260Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9460383Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9460733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9460936Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9461286Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9461391Z _lazy_init(state, module) 2022-11-23T03:12:18.9461715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9461835Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9462128Z File "", line 1, in 
2022-11-23T03:12:18.9462452Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9462564Z return func(*args, **kwargs) 2022-11-23T03:12:18.9462932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9463086Z p_assert( 2022-11-23T03:12:18.9463289Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9463420Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9463740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9464048Z traceback.print_stack() 2022-11-23T03:12:18.9464256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9464395Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9464599Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9464690Z self.run() 2022-11-23T03:12:18.9465036Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9465164Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9465484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9465607Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9466115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9466227Z getattr(self, test_name)() 2022-11-23T03:12:18.9466572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9466658Z fn() 2022-11-23T03:12:18.9467008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9467112Z test(self, **param_kwargs) 2022-11-23T03:12:18.9467454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9467565Z return func(*args, **kwargs) 2022-11-23T03:12:18.9467804Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9467913Z self.run_subtests( 2022-11-23T03:12:18.9468251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9468401Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9468751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9468884Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9469247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9469354Z output = model(*input) 2022-11-23T03:12:18.9469666Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9469795Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9470223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9470401Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9470755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9470858Z _lazy_init(state, module) 2022-11-23T03:12:18.9471193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9471325Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9471649Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9471762Z return func(*args, **kwargs) 2022-11-23T03:12:18.9472125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9472375Z p_assert( 2022-11-23T03:12:18.9472759Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9472864Z traceback.print_stack() 2022-11-23T03:12:18.9472977Z File "", line 1, in 2022-11-23T03:12:18.9473166Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9473291Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9473474Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9473607Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9473800Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9473889Z self.run() 2022-11-23T03:12:18.9474067Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9474195Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9474510Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9474635Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9474970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9475076Z getattr(self, test_name)() 2022-11-23T03:12:18.9475411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9475494Z fn() 2022-11-23T03:12:18.9476011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9476123Z test(self, **param_kwargs) 2022-11-23T03:12:18.9476464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9476576Z return func(*args, **kwargs) 2022-11-23T03:12:18.9476816Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9476924Z self.run_subtests( 2022-11-23T03:12:18.9477259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9477402Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9477755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9477895Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9478253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9478359Z output = model(*input) 2022-11-23T03:12:18.9478672Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9478800Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9479209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9479540Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9479875Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9479981Z _lazy_init(state, module) 2022-11-23T03:12:18.9480306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9480434Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9480748Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9480857Z return func(*args, **kwargs) 2022-11-23T03:12:18.9481209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9481348Z p_assert( 2022-11-23T03:12:18.9481658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9481770Z traceback.print_stack() 2022-11-23T03:12:18.9481986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9482200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9482412Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9482804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9482923Z File "", line 1, in 2022-11-23T03:12:18.9483113Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9483244Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9483432Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9483576Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9483778Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9483872Z self.run() 2022-11-23T03:12:18.9484061Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9484195Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9484517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9484640Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9484985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9485095Z getattr(self, test_name)() 2022-11-23T03:12:18.9485436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9485529Z fn() 2022-11-23T03:12:18.9485881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9485991Z test(self, **param_kwargs) 2022-11-23T03:12:18.9486328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9486601Z return func(*args, **kwargs) 2022-11-23T03:12:18.9487019Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9487122Z self.run_subtests( 2022-11-23T03:12:18.9487460Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9487609Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9488012Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9488159Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9488563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9488677Z output = model(*input) 2022-11-23T03:12:18.9488991Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9489119Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9489483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9489649Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9490155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9490262Z _lazy_init(state, module) 2022-11-23T03:12:18.9490584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9490759Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9491078Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9491188Z return func(*args, **kwargs) 2022-11-23T03:12:18.9491543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9491633Z p_assert( 2022-11-23T03:12:18.9492124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9492243Z traceback.print_stack() 2022-11-23T03:12:18.9492353Z File "", line 1, in 2022-11-23T03:12:18.9492546Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9492676Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9492793Z File "", line 1, in 2022-11-23T03:12:18.9492988Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9493132Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9493333Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9493417Z self.run() 2022-11-23T03:12:18.9493615Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9493743Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9493934Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9494067Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9494255Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9494392Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9494720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9494838Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9495038Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9495132Z self.run() 2022-11-23T03:12:18.9495479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9495590Z getattr(self, test_name)() 2022-11-23T03:12:18.9495780Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9495915Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9496412Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9496490Z fn() 2022-11-23T03:12:18.9496803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9496919Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9497308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9497421Z test(self, **param_kwargs) 2022-11-23T03:12:18.9497755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9497863Z getattr(self, test_name)() 2022-11-23T03:12:18.9498187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9498297Z return func(*args, **kwargs) 2022-11-23T03:12:18.9498629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9498711Z fn() 2022-11-23T03:12:18.9498941Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9499040Z self.run_subtests( 2022-11-23T03:12:18.9499379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9499536Z test(self, **param_kwargs) 2022-11-23T03:12:18.9500041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9500194Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9500534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9500647Z return func(*args, **kwargs) 2022-11-23T03:12:18.9500994Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9501135Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9501377Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9501479Z self.run_subtests( 2022-11-23T03:12:18.9501843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9501954Z output = model(*input) 2022-11-23T03:12:18.9502293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9502442Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9502755Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9502883Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9503390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9503528Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9504071Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9504250Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9504605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9504709Z output = model(*input) 2022-11-23T03:12:18.9505048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9505152Z _lazy_init(state, module) 2022-11-23T03:12:18.9505452Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in 
_call_impl 2022-11-23T03:12:18.9505576Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9505895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9506021Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9506368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9506804Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9507146Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9507258Z return func(*args, **kwargs) 2022-11-23T03:12:18.9507611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9507720Z _lazy_init(state, module) 2022-11-23T03:12:18.9508085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9508171Z p_assert( 2022-11-23T03:12:18.9508507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9508637Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9508957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9509139Z traceback.print_stack() 2022-11-23T03:12:18.9509625Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9509736Z return func(*args, **kwargs) 2022-11-23T03:12:18.9510081Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9510168Z p_assert( 2022-11-23T03:12:18.9510479Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9510590Z traceback.print_stack() 2022-11-23T03:12:18.9510703Z File "", line 1, in 2022-11-23T03:12:18.9510891Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9511017Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9511205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9511335Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9511533Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9511622Z self.run() 2022-11-23T03:12:18.9511804Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9511933Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9512248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9512366Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9512702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9512803Z getattr(self, test_name)() 2022-11-23T03:12:18.9513133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9513221Z fn() 2022-11-23T03:12:18.9513563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9513672Z test(self, **param_kwargs) 2022-11-23T03:12:18.9514001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9514111Z return func(*args, **kwargs) 2022-11-23T03:12:18.9514336Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in 
test_nested_wrapped_model 2022-11-23T03:12:18.9514433Z self.run_subtests( 2022-11-23T03:12:18.9514758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9514904Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9515243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9515381Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9515774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9515885Z output = model(*input) 2022-11-23T03:12:18.9516191Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9516308Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9516656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9516814Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9517154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9517259Z _lazy_init(state, module) 2022-11-23T03:12:18.9517583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9517759Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9518075Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9518178Z return func(*args, **kwargs) 2022-11-23T03:12:18.9518531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9518618Z p_assert( 2022-11-23T03:12:18.9519114Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9519228Z traceback.print_stack() 2022-11-23T03:12:18.9519454Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9519676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9519897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9520118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9520236Z File "", line 1, in 2022-11-23T03:12:18.9520433Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9520562Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9520751Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9520888Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9521087Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9521174Z self.run() 2022-11-23T03:12:18.9521363Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9521498Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9521829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9521958Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9522304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9522414Z getattr(self, test_name)() 2022-11-23T03:12:18.9522759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9522838Z fn() 2022-11-23T03:12:18.9523524Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9523636Z test(self, **param_kwargs) 2022-11-23T03:12:18.9523978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9524091Z return func(*args, **kwargs) 2022-11-23T03:12:18.9524331Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9524484Z self.run_subtests( 2022-11-23T03:12:18.9524832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9524976Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9525326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9525465Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9525824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9525931Z output = model(*input) 2022-11-23T03:12:18.9526242Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9526371Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9527145Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9527307Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9527660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9527769Z _lazy_init(state, module) 2022-11-23T03:12:18.9528106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.9528236Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9528558Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9528670Z return func(*args, **kwargs) 2022-11-23T03:12:18.9529040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9529130Z p_assert( 2022-11-23T03:12:18.9529455Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9529570Z traceback.print_stack() 2022-11-23T03:12:18.9529689Z File "", line 1, in 2022-11-23T03:12:18.9530041Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9530168Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9530351Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9530486Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9530715Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9530808Z self.run() 2022-11-23T03:12:18.9530992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9531299Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9531422Z File "", line 1, in 2022-11-23T03:12:18.9531758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9531880Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9532219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9532330Z getattr(self, test_name)() 2022-11-23T03:12:18.9532526Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9532655Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:18.9532998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9533085Z fn() 2022-11-23T03:12:18.9533273Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9533411Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9533806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9533926Z test(self, **param_kwargs) 2022-11-23T03:12:18.9534283Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9534372Z self.run() 2022-11-23T03:12:18.9534706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9534816Z return func(*args, **kwargs) 2022-11-23T03:12:18.9534999Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9535122Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9535356Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9535454Z self.run_subtests( 2022-11-23T03:12:18.9535767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9535945Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9536275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9536422Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9536755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9536856Z getattr(self, test_name)() 2022-11-23T03:12:18.9537193Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9537328Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9537657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9537740Z fn() 2022-11-23T03:12:18.9538086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9538199Z output = model(*input) 2022-11-23T03:12:18.9538724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9538829Z test(self, **param_kwargs) 2022-11-23T03:12:18.9539140Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9539270Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9539614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9539726Z return func(*args, **kwargs) 2022-11-23T03:12:18.9540087Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9540249Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9540495Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9540592Z self.run_subtests( 2022-11-23T03:12:18.9540946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9541055Z _lazy_init(state, module) 2022-11-23T03:12:18.9541391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9541542Z 
test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9541878Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9542007Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9542508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9542644Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9543177Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9543297Z return func(*args, **kwargs) 2022-11-23T03:12:18.9543662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9543752Z p_assert( 2022-11-23T03:12:18.9544322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9544431Z output = model(*input) 2022-11-23T03:12:18.9544757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9544864Z traceback.print_stack() 2022-11-23T03:12:18.9545174Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9545302Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9545747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9545911Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9546415Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9546522Z _lazy_init(state, module) 2022-11-23T03:12:18.9546844Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9546969Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9547275Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9547383Z return func(*args, **kwargs) 2022-11-23T03:12:18.9547735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9547826Z p_assert( 2022-11-23T03:12:18.9548140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9548250Z traceback.print_stack() 2022-11-23T03:12:18.9548365Z File "", line 1, in 2022-11-23T03:12:18.9548548Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9548674Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9548857Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9548991Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9549184Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9549273Z self.run() 2022-11-23T03:12:18.9549455Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9549584Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9549900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9550017Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9550353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9550460Z getattr(self, test_name)() 2022-11-23T03:12:18.9550789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:18.9550872Z fn() 2022-11-23T03:12:18.9551391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9551504Z test(self, **param_kwargs) 2022-11-23T03:12:18.9551841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9551953Z return func(*args, **kwargs) 2022-11-23T03:12:18.9552258Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9552367Z self.run_subtests( 2022-11-23T03:12:18.9552711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9552864Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9553215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9553356Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9553710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9553817Z output = model(*input) 2022-11-23T03:12:18.9554129Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9554258Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9554835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9554996Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9555339Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9555715Z _lazy_init(state, module) 2022-11-23T03:12:18.9556046Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9556177Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9556502Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9556615Z return func(*args, **kwargs) 2022-11-23T03:12:18.9556978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9557074Z p_assert( 2022-11-23T03:12:18.9557400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9557516Z traceback.print_stack() 2022-11-23T03:12:18.9557735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9557961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9558182Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9558400Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9558518Z File "", line 1, in 2022-11-23T03:12:18.9558715Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9558843Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9559029Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9559169Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9559369Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9559460Z self.run() 2022-11-23T03:12:18.9559650Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9559783Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9560112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9560232Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9560732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9560839Z getattr(self, test_name)() 2022-11-23T03:12:18.9561172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9561259Z fn() 2022-11-23T03:12:18.9561647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9561762Z test(self, **param_kwargs) 2022-11-23T03:12:18.9562094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9562204Z return func(*args, **kwargs) 2022-11-23T03:12:18.9562614Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9562717Z self.run_subtests( 2022-11-23T03:12:18.9563056Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9563208Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9563556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9563741Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9564107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9564216Z output = model(*input) 2022-11-23T03:12:18.9564527Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9564656Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9565019Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9565183Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9565799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9565909Z _lazy_init(state, module) 2022-11-23T03:12:18.9566236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9566370Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9566863Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9566981Z return func(*args, **kwargs) 2022-11-23T03:12:18.9567348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9567440Z p_assert( 2022-11-23T03:12:18.9567761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9567875Z traceback.print_stack() 2022-11-23T03:12:18.9567995Z File "", line 1, in 2022-11-23T03:12:18.9568192Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9568316Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9568513Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9568656Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9568856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9568947Z self.run() 2022-11-23T03:12:18.9569135Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9569267Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9569586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9569708Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9570051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9570161Z getattr(self, test_name)() 2022-11-23T03:12:18.9570503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9570594Z fn() 2022-11-23T03:12:18.9570993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9571111Z test(self, **param_kwargs) 2022-11-23T03:12:18.9571449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9571561Z return func(*args, **kwargs) 2022-11-23T03:12:18.9571800Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9571901Z self.run_subtests( 2022-11-23T03:12:18.9572239Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9572387Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9573070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9573266Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9573626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9573735Z output = model(*input) 2022-11-23T03:12:18.9573851Z File "", line 1, in 2022-11-23T03:12:18.9574165Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9574293Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9574655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9574818Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9575014Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9575135Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9575498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9575608Z _lazy_init(state, module) 2022-11-23T03:12:18.9575956Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9576091Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9576416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9576542Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9576736Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 
2022-11-23T03:12:18.9576819Z self.run() 2022-11-23T03:12:18.9577132Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9577240Z return func(*args, **kwargs) 2022-11-23T03:12:18.9577425Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9577564Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9577920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9578007Z p_assert( 2022-11-23T03:12:18.9578318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9578428Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9578739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9578846Z traceback.print_stack() 2022-11-23T03:12:18.9579362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9579475Z getattr(self, test_name)() 2022-11-23T03:12:18.9579820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9579911Z fn() 2022-11-23T03:12:18.9580303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9580420Z test(self, **param_kwargs) 2022-11-23T03:12:18.9580763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9580879Z return func(*args, **kwargs) 2022-11-23T03:12:18.9581118Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9581220Z self.run_subtests( 2022-11-23T03:12:18.9581556Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9581705Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9582207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9582390Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9582745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9582849Z output = model(*input) 2022-11-23T03:12:18.9583322Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9583453Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9583815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9584178Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9584541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9584645Z _lazy_init(state, module) 2022-11-23T03:12:18.9584990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9585121Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9585442Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9585555Z return func(*args, **kwargs) 2022-11-23T03:12:18.9585920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9586012Z p_assert( 2022-11-23T03:12:18.9586336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9586444Z traceback.print_stack() 2022-11-23T03:12:18.9586561Z File "", line 1, in 2022-11-23T03:12:18.9586917Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9587043Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9587417Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9587556Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9587757Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9587844Z self.run() 2022-11-23T03:12:18.9588088Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9588227Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9588551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9588671Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9589016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9589127Z getattr(self, test_name)() 2022-11-23T03:12:18.9589470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9589624Z fn() 2022-11-23T03:12:18.9589987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9590098Z test(self, **param_kwargs) 2022-11-23T03:12:18.9590592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9590702Z return func(*args, **kwargs) 2022-11-23T03:12:18.9590933Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9591030Z self.run_subtests( 2022-11-23T03:12:18.9591354Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9591493Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9591829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9592041Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9592578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9592686Z output = model(*input) 2022-11-23T03:12:18.9592996Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9593126Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9593488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9593645Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9593995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9594104Z _lazy_init(state, module) 2022-11-23T03:12:18.9594452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9594585Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9594909Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9595021Z return func(*args, **kwargs) 2022-11-23T03:12:18.9595385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9595469Z p_assert( 2022-11-23T03:12:18.9595792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9595905Z traceback.print_stack() 2022-11-23T03:12:18.9596130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9596352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9596732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9596944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9597059Z File "", line 1, in 2022-11-23T03:12:18.9597244Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9597369Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9597551Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9597684Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9597878Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9597967Z self.run() 2022-11-23T03:12:18.9598151Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9598274Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9598644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9598767Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9599104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9599211Z getattr(self, test_name)() 2022-11-23T03:12:18.9599542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9599625Z fn() 2022-11-23T03:12:18.9599962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9600241Z test(self, **param_kwargs) 2022-11-23T03:12:18.9600586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9600698Z return func(*args, **kwargs) 2022-11-23T03:12:18.9600988Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9601091Z self.run_subtests( 2022-11-23T03:12:18.9601430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9601581Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9601931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9602067Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9602427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9602536Z output = model(*input) 2022-11-23T03:12:18.9602850Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9602982Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9603349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9603516Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9604209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9604314Z _lazy_init(state, module) 2022-11-23T03:12:18.9604651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9604783Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9605107Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9605220Z return func(*args, **kwargs) 2022-11-23T03:12:18.9605583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9605679Z p_assert( 2022-11-23T03:12:18.9606003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9606110Z traceback.print_stack() 2022-11-23T03:12:18.9606227Z File "", line 1, in 2022-11-23T03:12:18.9606425Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9606555Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9606746Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9606887Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9607088Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9607180Z self.run() 2022-11-23T03:12:18.9607366Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9607501Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9607880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9608007Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9608354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9608464Z getattr(self, test_name)() 2022-11-23T03:12:18.9608807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9608887Z fn() 2022-11-23T03:12:18.9609239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9609349Z test(self, **param_kwargs) 2022-11-23T03:12:18.9609464Z File "", line 1, in 2022-11-23T03:12:18.9609809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9609972Z return func(*args, **kwargs) 2022-11-23T03:12:18.9610377Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9610476Z self.run_subtests( 2022-11-23T03:12:18.9610659Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9610785Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9611298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9611450Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9611640Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9611781Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9612130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9612277Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9612474Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9612567Z self.run() 2022-11-23T03:12:18.9612929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9613036Z output = model(*input) 2022-11-23T03:12:18.9613228Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9613363Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9613678Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9613811Z return 
forward_call(*input, **kwargs) 2022-11-23T03:12:18.9614131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9614252Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9614623Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9614790Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9615137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9615248Z getattr(self, test_name)() 2022-11-23T03:12:18.9615599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9615709Z _lazy_init(state, module) 2022-11-23T03:12:18.9616048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9616134Z fn() 2022-11-23T03:12:18.9616630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9616761Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9617141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9617254Z test(self, **param_kwargs) 2022-11-23T03:12:18.9617568Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9617677Z return func(*args, **kwargs) 2022-11-23T03:12:18.9618006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9618115Z return func(*args, **kwargs) 2022-11-23T03:12:18.9618469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T03:12:18.9618556Z p_assert( 2022-11-23T03:12:18.9618787Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9618932Z self.run_subtests( 2022-11-23T03:12:18.9619436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9619545Z traceback.print_stack() 2022-11-23T03:12:18.9619885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9620034Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9620382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9620523Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9620882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9620992Z output = model(*input) 2022-11-23T03:12:18.9621301Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9621436Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9621795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9621963Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9622468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9622576Z _lazy_init(state, module) 2022-11-23T03:12:18.9623090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9623222Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9623546Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9623660Z return func(*args, **kwargs) 2022-11-23T03:12:18.9624233Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9624342Z p_assert( 2022-11-23T03:12:18.9624673Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9624787Z traceback.print_stack() 2022-11-23T03:12:18.9624904Z File "", line 1, in 2022-11-23T03:12:18.9625101Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9625229Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9625413Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9625550Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9625750Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9625841Z self.run() 2022-11-23T03:12:18.9626031Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9626169Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9626562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9626695Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9627197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9627480Z getattr(self, test_name)() 2022-11-23T03:12:18.9627827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9627914Z fn() 2022-11-23T03:12:18.9628265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9628377Z test(self, **param_kwargs) 2022-11-23T03:12:18.9628718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9628893Z return func(*args, **kwargs) 2022-11-23T03:12:18.9629132Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9629238Z self.run_subtests( 2022-11-23T03:12:18.9629581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9629731Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9630081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9630381Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9630773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9630880Z output = model(*input) 2022-11-23T03:12:18.9631177Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9631306Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9631845Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9632010Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9632360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9632470Z _lazy_init(state, module) 2022-11-23T03:12:18.9632807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9632936Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9633252Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9633364Z return func(*args, **kwargs) 2022-11-23T03:12:18.9633727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9633824Z p_assert( 2022-11-23T03:12:18.9634148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9634261Z traceback.print_stack() 2022-11-23T03:12:18.9634484Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9634864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9635072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9635286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9635400Z File "", line 1, in 2022-11-23T03:12:18.9635590Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9635715Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9635944Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9636083Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9636268Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9636358Z self.run() 2022-11-23T03:12:18.9636541Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9636669Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9636988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9637103Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9637438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9637545Z getattr(self, test_name)() 2022-11-23T03:12:18.9637872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9638009Z fn() 2022-11-23T03:12:18.9638351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9638456Z test(self, **param_kwargs) 2022-11-23T03:12:18.9638965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9639077Z return func(*args, **kwargs) 2022-11-23T03:12:18.9639317Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9639419Z self.run_subtests( 2022-11-23T03:12:18.9639751Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9639903Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9640250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9640398Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9640758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9640865Z output = model(*input) 2022-11-23T03:12:18.9641178Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9641306Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9641662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9641828Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9641945Z File "", line 1, in 2022-11-23T03:12:18.9642295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9642566Z _lazy_init(state, module) 2022-11-23T03:12:18.9642948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9643084Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9643275Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9643571Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9643899Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9644012Z return func(*args, **kwargs) 2022-11-23T03:12:18.9644200Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:12:18.9644338Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9644701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9644798Z p_assert( 2022-11-23T03:12:18.9645049Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9645142Z self.run() 2022-11-23T03:12:18.9645469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9645583Z traceback.print_stack() 2022-11-23T03:12:18.9645771Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9645908Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9646229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9646351Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9646692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9646804Z getattr(self, test_name)() 2022-11-23T03:12:18.9647150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9647330Z fn() 2022-11-23T03:12:18.9647684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9647795Z test(self, **param_kwargs) 2022-11-23T03:12:18.9648136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9648247Z return func(*args, **kwargs) 2022-11-23T03:12:18.9648639Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9648738Z self.run_subtests( 2022-11-23T03:12:18.9649067Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9649210Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9649546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9649858Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9650220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9650327Z output = model(*input) 2022-11-23T03:12:18.9650633Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9650762Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9651126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9651288Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9651640Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9651752Z _lazy_init(state, module) 2022-11-23T03:12:18.9652092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9652223Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9652548Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9652654Z return func(*args, **kwargs) 2022-11-23T03:12:18.9653022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9653112Z p_assert( 2022-11-23T03:12:18.9653434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9653548Z traceback.print_stack() 2022-11-23T03:12:18.9653666Z File "", line 1, in 2022-11-23T03:12:18.9653861Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9653988Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9654222Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9654369Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9654568Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9654660Z self.run() 2022-11-23T03:12:18.9654851Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9655146Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9655459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9655571Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9655905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9656017Z getattr(self, test_name)() 2022-11-23T03:12:18.9656415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9656498Z fn() 2022-11-23T03:12:18.9656839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9656947Z test(self, **param_kwargs) 2022-11-23T03:12:18.9657276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9657378Z return func(*args, **kwargs) 2022-11-23T03:12:18.9657610Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9657709Z self.run_subtests( 2022-11-23T03:12:18.9658036Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9658182Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9658532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9658670Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9659018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9659116Z output = model(*input) 2022-11-23T03:12:18.9659416Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9659542Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9659893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9660050Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9660388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9660497Z _lazy_init(state, module) 2022-11-23T03:12:18.9660824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9660946Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9661261Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9661368Z return func(*args, **kwargs) 2022-11-23T03:12:18.9661719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9661806Z p_assert( 2022-11-23T03:12:18.9662116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9662225Z traceback.print_stack() 2022-11-23T03:12:18.9662331Z File "", line 1, in 2022-11-23T03:12:18.9662521Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9662826Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9663066Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9663214Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9663415Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9663506Z self.run() 2022-11-23T03:12:18.9663697Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9663824Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9664374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9664498Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9664847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9664961Z getattr(self, test_name)() 2022-11-23T03:12:18.9665392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9665481Z fn() 2022-11-23T03:12:18.9665982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9666085Z test(self, **param_kwargs) 2022-11-23T03:12:18.9666419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9666528Z return func(*args, **kwargs) 2022-11-23T03:12:18.9666764Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9666862Z self.run_subtests( 2022-11-23T03:12:18.9667375Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9667525Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9667883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9668019Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9668382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9668490Z output = model(*input) 2022-11-23T03:12:18.9668801Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9668930Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9669289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9669452Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9669810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9669917Z _lazy_init(state, module) 2022-11-23T03:12:18.9670256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9670387Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9670713Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9670825Z return func(*args, **kwargs) 2022-11-23T03:12:18.9671194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9671284Z p_assert( 2022-11-23T03:12:18.9671607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9671715Z traceback.print_stack() 2022-11-23T03:12:18.9671942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9672165Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9672450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9672679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9672798Z File "", line 1, in 2022-11-23T03:12:18.9673154Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9673281Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9673456Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9673592Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9673787Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9673877Z self.run() 2022-11-23T03:12:18.9674060Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9674237Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9674559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9674670Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9675007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9675116Z getattr(self, test_name)() 2022-11-23T03:12:18.9675447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9675531Z fn() 2022-11-23T03:12:18.9675869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9675977Z test(self, **param_kwargs) 2022-11-23T03:12:18.9676307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9676414Z return func(*args, **kwargs) 2022-11-23T03:12:18.9676649Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9676747Z self.run_subtests( 2022-11-23T03:12:18.9677072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9677396Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9677746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9677888Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9678248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9678349Z output = model(*input) 2022-11-23T03:12:18.9678662Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9678796Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9679163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9679330Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9679682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9679791Z _lazy_init(state, module) 2022-11-23T03:12:18.9680291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9680411Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9680728Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9680839Z return func(*args, **kwargs) 2022-11-23T03:12:18.9681366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9681509Z p_assert( 2022-11-23T03:12:18.9681845Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9681959Z traceback.print_stack() 2022-11-23T03:12:18.9682078Z File "", line 1, in 2022-11-23T03:12:18.9682268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9682403Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9682593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9682732Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9682933Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9683025Z self.run() 2022-11-23T03:12:18.9683216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9683398Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9683720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9683844Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9684193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9684306Z getattr(self, test_name)() 2022-11-23T03:12:18.9684425Z File "", line 1, in 2022-11-23T03:12:18.9684769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9684855Z fn() 2022-11-23T03:12:18.9685200Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9685314Z test(self, **param_kwargs) 2022-11-23T03:12:18.9685511Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9685648Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9685992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9686104Z return func(*args, **kwargs) 2022-11-23T03:12:18.9686293Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9686431Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9686663Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9686763Z self.run_subtests( 2022-11-23T03:12:18.9686965Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9687056Z self.run() 2022-11-23T03:12:18.9687730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9687981Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9688179Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9688314Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9688657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9688798Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9689117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9689237Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9689596Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9689704Z output = model(*input) 2022-11-23T03:12:18.9690049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9690165Z getattr(self, test_name)() 2022-11-23T03:12:18.9690518Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9690813Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9691148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9691231Z fn() 2022-11-23T03:12:18.9691576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9691734Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9692069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9692175Z test(self, **param_kwargs) 2022-11-23T03:12:18.9692508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9692839Z _lazy_init(state, module) 2022-11-23T03:12:18.9693190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9693304Z return func(*args, **kwargs) 2022-11-23T03:12:18.9693643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9693778Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9694019Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9694121Z self.run_subtests( 
2022-11-23T03:12:18.9694438Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9694551Z return func(*args, **kwargs) 2022-11-23T03:12:18.9694890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9695047Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9695413Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9695503Z p_assert( 2022-11-23T03:12:18.9695852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9695993Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9696309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9696423Z traceback.print_stack() 2022-11-23T03:12:18.9696940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9697047Z output = model(*input) 2022-11-23T03:12:18.9697349Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9697481Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9697832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9697990Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9698323Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9698430Z _lazy_init(state, module) 2022-11-23T03:12:18.9698757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in 
_lazy_init 2022-11-23T03:12:18.9698883Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9699193Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9699299Z return func(*args, **kwargs) 2022-11-23T03:12:18.9699703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9699798Z p_assert( 2022-11-23T03:12:18.9700105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9700214Z traceback.print_stack() 2022-11-23T03:12:18.9700328Z File "", line 1, in 2022-11-23T03:12:18.9700516Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9700641Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9701006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9701146Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9701340Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9701434Z self.run() 2022-11-23T03:12:18.9701622Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9701808Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9702138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9702260Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9702606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9702717Z getattr(self, test_name)() 2022-11-23T03:12:18.9703054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9703141Z fn() 2022-11-23T03:12:18.9703490Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9703599Z test(self, **param_kwargs) 2022-11-23T03:12:18.9704148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9704275Z return func(*args, **kwargs) 2022-11-23T03:12:18.9704520Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9704629Z self.run_subtests( 2022-11-23T03:12:18.9704963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9705113Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9705464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9705602Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9705964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9706072Z output = model(*input) 2022-11-23T03:12:18.9706383Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9706518Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9706879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9707043Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9707397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9707505Z _lazy_init(state, module) 2022-11-23T03:12:18.9707843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:18.9707973Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9708296Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9708409Z return func(*args, **kwargs) 2022-11-23T03:12:18.9708841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9708942Z p_assert( 2022-11-23T03:12:18.9709269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9709382Z traceback.print_stack() 2022-11-23T03:12:18.9709606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9709828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9710048Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9710432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9710636Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9710848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9711118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9711328Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9711535Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9711741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9711946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9712153Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9712352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9712561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9712769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9712978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9713186Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9713388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9713593Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9713796Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9714001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9714200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9714403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9714613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9714816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9715017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9715220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9715422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9716151Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9716914Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9717633Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9718337Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9719038Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9719987Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9720712Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9721436Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:18.9721659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9721881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9722098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9722315Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9722691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9722896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9723104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9723317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9723525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9723724Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9723930Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9724318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9724532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9724744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9724957Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9725170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9725431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9725643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9725856Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9726066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9726276Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9726489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9726700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9726910Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9727164Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9727544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9727918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9728131Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9728341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9728554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9728765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9728974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9729184Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9729402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9729608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9729818Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9730029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9730240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9730451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9730870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9731077Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9731282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9731487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9731690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9732073Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9732285Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9732499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9732711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9732922Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9733133Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9733345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9733602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9733820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9734030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9734240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9734452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9734662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9735035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9735240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9735482Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9735693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9735898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9736103Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9736306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9736508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9736710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9736912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9737112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9737214Z dist init r=0, world=4 2022-11-23T03:12:18.9737524Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9737821Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9738110Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9738394Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9738674Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9738960Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9739424Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9739712Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9739999Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9740281Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9740621Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9740921Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:18.9741020Z dist init r=1, world=4 2022-11-23T03:12:18.9741331Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9741633Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9741929Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9742222Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9742559Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9743179Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9743470Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9743759Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9744240Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9744546Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9744834Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9745121Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:18.9745221Z dist init r=3, world=4 2022-11-23T03:12:18.9745533Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9745821Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9746118Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9746405Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9746692Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9747139Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9747411Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9747755Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9748048Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9748326Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9748603Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:18.9748881Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:18.9748976Z dist init r=2, world=4 2022-11-23T03:12:18.9749271Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9749615Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9749900Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9750181Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9750454Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9750737Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9751022Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9751301Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9751579Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9752034Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:18.9752324Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9752619Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:18.9752710Z ok (6.924s) 2022-11-23T03:12:18.9753038Z test_nested_wrapped_model_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25871 2022-11-23T03:12:18.9753248Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25872 2022-11-23T03:12:18.9753447Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25873 2022-11-23T03:12:18.9753650Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25874 2022-11-23T03:12:18.9754025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.9754190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.9754607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.9754793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.9755148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.9755649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.9756015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.9756188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.9756541Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.9756704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.9757067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.9757296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.9757651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:18.9757814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:18.9758178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:18.9758347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:18.9758734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:18.9758960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:18.9759181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:18.9759408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:18.9759784Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.9760156Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.9760521Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:18.9760881Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:18.9761085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:18.9761294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:18.9761508Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:18.9761712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:18.9761926Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9762139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9762349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9762558Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9763768Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.9763882Z warnings.warn( 2022-11-23T03:12:18.9764883Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.9764977Z warnings.warn( 2022-11-23T03:12:18.9765962Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:18.9766263Z warnings.warn( 2022-11-23T03:12:18.9767434Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:18.9767532Z warnings.warn( 2022-11-23T03:12:18.9767651Z File "", line 1, in 2022-11-23T03:12:18.9767853Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9767989Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9768184Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9768323Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9768517Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9768613Z self.run() 2022-11-23T03:12:18.9768805Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9768939Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9769060Z File "", line 1, in 2022-11-23T03:12:18.9769391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9769513Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9769864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9769969Z getattr(self, test_name)() 2022-11-23T03:12:18.9770174Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9770306Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9770656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9770746Z fn() 2022-11-23T03:12:18.9770936Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9771075Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9771427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9771532Z test(self, **param_kwargs) 2022-11-23T03:12:18.9771737Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9771828Z self.run() 2022-11-23T03:12:18.9772174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9772337Z return func(*args, **kwargs) 2022-11-23T03:12:18.9772536Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9772672Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9772906Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9773009Z self.run_subtests( 2022-11-23T03:12:18.9773495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9773612Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9773940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9774085Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9774418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9774588Z getattr(self, test_name)() 2022-11-23T03:12:18.9774921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9775058Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9775389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9775471Z fn() 2022-11-23T03:12:18.9775821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9775925Z output = model(*input) 2022-11-23T03:12:18.9776456Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9776567Z test(self, **param_kwargs) 2022-11-23T03:12:18.9776874Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9777011Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9777358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9777470Z return func(*args, **kwargs) 2022-11-23T03:12:18.9777833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9777999Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9778240Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9778342Z self.run_subtests( 2022-11-23T03:12:18.9778691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9778801Z _lazy_init(state, module) 2022-11-23T03:12:18.9779141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9779299Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9779799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9779926Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9780264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9780399Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9780712Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:18.9780816Z return func(*args, **kwargs) 2022-11-23T03:12:18.9781163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9781267Z output = model(*input) 2022-11-23T03:12:18.9781700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9781795Z p_assert( 2022-11-23T03:12:18.9782099Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9782222Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9782525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9782636Z traceback.print_stack() 2022-11-23T03:12:18.9783165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9783328Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9783680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9783840Z _lazy_init(state, module) 2022-11-23T03:12:18.9784395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9784528Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9784852Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9784958Z return func(*args, **kwargs) 2022-11-23T03:12:18.9785322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9785411Z p_assert( 2022-11-23T03:12:18.9785736Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in 
p_assert 2022-11-23T03:12:18.9785852Z traceback.print_stack() 2022-11-23T03:12:18.9785968Z File "", line 1, in 2022-11-23T03:12:18.9786083Z File "", line 1, in 2022-11-23T03:12:18.9786277Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9786415Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9786608Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9786748Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9786945Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9787077Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9787277Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9787370Z self.run() 2022-11-23T03:12:18.9787707Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9787842Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9788241Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9788380Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9788587Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9788681Z self.run() 2022-11-23T03:12:18.9789011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9789126Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9789317Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9789450Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9789802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9789915Z getattr(self, test_name)() 2022-11-23T03:12:18.9790237Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9790357Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9790701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9790853Z fn() 2022-11-23T03:12:18.9791368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9791477Z getattr(self, test_name)() 2022-11-23T03:12:18.9791817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9791922Z test(self, **param_kwargs) 2022-11-23T03:12:18.9792249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9792333Z fn() 2022-11-23T03:12:18.9792663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9792766Z return func(*args, **kwargs) 2022-11-23T03:12:18.9793285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9793465Z test(self, **param_kwargs) 2022-11-23T03:12:18.9793708Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9793810Z self.run_subtests( 2022-11-23T03:12:18.9794152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9794266Z return func(*args, **kwargs) 2022-11-23T03:12:18.9794604Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9794748Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9794987Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9795089Z self.run_subtests( 2022-11-23T03:12:18.9795441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9795589Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9795926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9796077Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9796435Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9796540Z output = model(*input) 2022-11-23T03:12:18.9796888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9797028Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9797490Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9797616Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9797974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9798081Z output = model(*input) 2022-11-23T03:12:18.9798436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9798589Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9798892Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9799015Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9799354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:18.9799459Z _lazy_init(state, module) 2022-11-23T03:12:18.9799805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9799964Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9800336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9800464Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9800803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9800907Z _lazy_init(state, module) 2022-11-23T03:12:18.9801401Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9801514Z return func(*args, **kwargs) 2022-11-23T03:12:18.9801848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9801978Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9802344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9802475Z p_assert( 2022-11-23T03:12:18.9802808Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9802921Z return func(*args, **kwargs) 2022-11-23T03:12:18.9803244Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9803360Z traceback.print_stack() 2022-11-23T03:12:18.9803718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9803807Z p_assert( 2022-11-23T03:12:18.9804122Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.9804229Z traceback.print_stack() 2022-11-23T03:12:18.9804617Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9804837Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9805056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9805266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9805380Z File "", line 1, in 2022-11-23T03:12:18.9805493Z File "", line 1, in 2022-11-23T03:12:18.9805684Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9805804Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9805992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9806115Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9806299Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9806432Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9806618Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9806754Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9806940Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9807030Z self.run() 2022-11-23T03:12:18.9807219Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9807306Z self.run() 2022-11-23T03:12:18.9807490Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9807620Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9807985Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9808120Z self._target(*self._args, 
**self._kwargs) 2022-11-23T03:12:18.9808446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9808573Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9808942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9809070Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9809418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9809529Z getattr(self, test_name)() 2022-11-23T03:12:18.9809871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9809982Z getattr(self, test_name)() 2022-11-23T03:12:18.9810318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9810404Z fn() 2022-11-23T03:12:18.9810908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9810990Z fn() 2022-11-23T03:12:18.9811389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9811496Z test(self, **param_kwargs) 2022-11-23T03:12:18.9811830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9811931Z test(self, **param_kwargs) 2022-11-23T03:12:18.9812262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9812371Z return func(*args, **kwargs) 2022-11-23T03:12:18.9812702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9812810Z return func(*args, **kwargs) 
2022-11-23T03:12:18.9813040Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9813138Z self.run_subtests( 2022-11-23T03:12:18.9813373Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9813464Z self.run_subtests( 2022-11-23T03:12:18.9813789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9813933Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9814260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9814403Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9814739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9814876Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9815208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9815339Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9815691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9815796Z output = model(*input) 2022-11-23T03:12:18.9816145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9816247Z output = model(*input) 2022-11-23T03:12:18.9816547Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9816671Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9816971Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in 
_call_impl 2022-11-23T03:12:18.9817089Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9817442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9817647Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9818003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9818160Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9818500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9818604Z _lazy_init(state, module) 2022-11-23T03:12:18.9818943Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9819049Z _lazy_init(state, module) 2022-11-23T03:12:18.9819369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9819496Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9819877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9820181Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9820509Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9820624Z return func(*args, **kwargs) 2022-11-23T03:12:18.9820947Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9821060Z return func(*args, **kwargs) 2022-11-23T03:12:18.9821423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9821515Z p_assert( 2022-11-23T03:12:18.9821881Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9821970Z p_assert( 2022-11-23T03:12:18.9822298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9822416Z traceback.print_stack() 2022-11-23T03:12:18.9822739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9822844Z traceback.print_stack() 2022-11-23T03:12:18.9822963Z File "", line 1, in 2022-11-23T03:12:18.9823161Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9823452Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9823636Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9823768Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9824162Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9824260Z self.run() 2022-11-23T03:12:18.9824619Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9824761Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9825098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9825218Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9825563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9825673Z getattr(self, test_name)() 2022-11-23T03:12:18.9826017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9826102Z fn() 2022-11-23T03:12:18.9826444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9826554Z test(self, **param_kwargs) 2022-11-23T03:12:18.9826895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9827015Z return func(*args, **kwargs) 2022-11-23T03:12:18.9827323Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9827434Z self.run_subtests( 2022-11-23T03:12:18.9827934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9828253Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9828597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9828737Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9829098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9829206Z output = model(*input) 2022-11-23T03:12:18.9829518Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9829713Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9830078Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9830243Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9830590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9830700Z _lazy_init(state, module) 2022-11-23T03:12:18.9831244Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9831372Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9831686Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9831795Z return func(*args, **kwargs) 2022-11-23T03:12:18.9832154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9832422Z p_assert( 2022-11-23T03:12:18.9832741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9832855Z traceback.print_stack() 2022-11-23T03:12:18.9832971Z File "", line 1, in 2022-11-23T03:12:18.9833166Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9833296Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9833486Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9833625Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9833818Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9833911Z self.run() 2022-11-23T03:12:18.9834100Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9834244Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9834572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9834693Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9835041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9835154Z getattr(self, test_name)() 2022-11-23T03:12:18.9835645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9835728Z fn() 2022-11-23T03:12:18.9836068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9836176Z test(self, **param_kwargs) 2022-11-23T03:12:18.9836507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9836670Z return func(*args, **kwargs) 2022-11-23T03:12:18.9836911Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9837009Z self.run_subtests( 2022-11-23T03:12:18.9837331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9837476Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9837815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9837950Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9838300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9838403Z output = model(*input) 2022-11-23T03:12:18.9838705Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9839071Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9839427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9839590Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9839943Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9840051Z _lazy_init(state, module) 2022-11-23T03:12:18.9840389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9840520Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9840842Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9840959Z return func(*args, **kwargs) 2022-11-23T03:12:18.9841324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9841416Z p_assert( 2022-11-23T03:12:18.9841740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9841853Z traceback.print_stack() 2022-11-23T03:12:18.9842076Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9842299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9842517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9842736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9842847Z File "", line 1, in 2022-11-23T03:12:18.9843046Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9843337Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9843523Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9843658Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9843850Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9844114Z self.run() 2022-11-23T03:12:18.9844299Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9844433Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9844762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9844882Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9845228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9845340Z getattr(self, test_name)() 2022-11-23T03:12:18.9845741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9845836Z fn() 2022-11-23T03:12:18.9846184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9846296Z test(self, **param_kwargs) 2022-11-23T03:12:18.9846635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9846747Z return func(*args, **kwargs) 2022-11-23T03:12:18.9846988Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9847090Z self.run_subtests( 2022-11-23T03:12:18.9847582Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9847728Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9848116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9848254Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9848606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9848709Z output = model(*input) 2022-11-23T03:12:18.9849007Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9849130Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9849479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9849637Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9849969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9850078Z _lazy_init(state, module) 2022-11-23T03:12:18.9850409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9850537Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9850650Z File "", line 1, in 2022-11-23T03:12:18.9850967Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9851075Z return func(*args, **kwargs) 2022-11-23T03:12:18.9851428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9851509Z p_assert( 2022-11-23T03:12:18.9851700Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in 
spawn_main 2022-11-23T03:12:18.9851826Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9852314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9852435Z traceback.print_stack() 2022-11-23T03:12:18.9852627Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9852768Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9852969Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9853055Z self.run() 2022-11-23T03:12:18.9853245Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9853379Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9853702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9853822Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9854167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9854279Z getattr(self, test_name)() 2022-11-23T03:12:18.9854675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9854763Z fn() 2022-11-23T03:12:18.9855115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9855225Z test(self, **param_kwargs) 2022-11-23T03:12:18.9855728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9855837Z return func(*args, **kwargs) 2022-11-23T03:12:18.9856070Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9856169Z self.run_subtests( 2022-11-23T03:12:18.9856487Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9856631Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9857022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9857158Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9857506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9857610Z output = model(*input) 2022-11-23T03:12:18.9858092Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9858222Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9858577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9858740Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9858859Z File "", line 1, in 2022-11-23T03:12:18.9859210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9859326Z _lazy_init(state, module) 2022-11-23T03:12:18.9859664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9859797Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9859992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9860116Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9860441Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9860553Z return func(*args, **kwargs) 2022-11-23T03:12:18.9860909Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:12:18.9861044Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9861398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9861493Z p_assert( 2022-11-23T03:12:18.9861688Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9861771Z self.run() 2022-11-23T03:12:18.9862084Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9862194Z traceback.print_stack() 2022-11-23T03:12:18.9862379Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9862509Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9862818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9862935Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9863266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9863551Z getattr(self, test_name)() 2022-11-23T03:12:18.9864170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9864279Z fn() 2022-11-23T03:12:18.9864645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9864757Z test(self, **param_kwargs) 2022-11-23T03:12:18.9865100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9865211Z return func(*args, **kwargs) 2022-11-23T03:12:18.9865452Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9865548Z self.run_subtests( 2022-11-23T03:12:18.9865990Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9866138Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9866713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9866853Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9867201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9867306Z output = model(*input) 2022-11-23T03:12:18.9867782Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9867905Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9868268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9868431Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9868782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9868897Z _lazy_init(state, module) 2022-11-23T03:12:18.9869237Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9869371Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9869696Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9869802Z return func(*args, **kwargs) 2022-11-23T03:12:18.9870166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9870258Z p_assert( 2022-11-23T03:12:18.9870581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9870695Z traceback.print_stack() 2022-11-23T03:12:18.9870812Z File "", line 1, in 2022-11-23T03:12:18.9871010Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9871141Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9871332Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9871470Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9871669Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9871761Z self.run() 2022-11-23T03:12:18.9871952Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9872086Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9872412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9872527Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9872876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9872990Z getattr(self, test_name)() 2022-11-23T03:12:18.9873386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9873478Z fn() 2022-11-23T03:12:18.9873831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9873941Z test(self, **param_kwargs) 2022-11-23T03:12:18.9874281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9874387Z return func(*args, **kwargs) 2022-11-23T03:12:18.9874628Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9874729Z self.run_subtests( 2022-11-23T03:12:18.9875065Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9875263Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9875618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9875759Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9876280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9876379Z output = model(*input) 2022-11-23T03:12:18.9876681Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9876805Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9877155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9877313Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9877654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9877770Z _lazy_init(state, module) 2022-11-23T03:12:18.9878100Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9878219Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9878533Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9878642Z return func(*args, **kwargs) 2022-11-23T03:12:18.9878999Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9879088Z p_assert( 2022-11-23T03:12:18.9879575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9879691Z traceback.print_stack() 2022-11-23T03:12:18.9879919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9880142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9880363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9880582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9880703Z File "", line 1, in 2022-11-23T03:12:18.9880907Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9881035Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9881227Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9881364Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9881557Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9881650Z self.run() 2022-11-23T03:12:18.9881766Z File "", line 1, in 2022-11-23T03:12:18.9882009Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9882148Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9882428Z File "", line 1, in 2022-11-23T03:12:18.9882752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9882863Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9883054Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9883180Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9883516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9883625Z getattr(self, test_name)() 2022-11-23T03:12:18.9883812Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9883936Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9884185Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9884494Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9884848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9884934Z fn() 2022-11-23T03:12:18.9885126Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9885262Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9885460Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9885553Z self.run() 2022-11-23T03:12:18.9885899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9886013Z test(self, **param_kwargs) 2022-11-23T03:12:18.9886210Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9886306Z self.run() 2022-11-23T03:12:18.9886503Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9886636Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9886983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9887095Z return func(*args, **kwargs) 2022-11-23T03:12:18.9887276Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9887407Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9887895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9888011Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9888296Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9888578Z self.run_subtests( 2022-11-23T03:12:18.9888912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9889033Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9889372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9889485Z getattr(self, test_name)() 2022-11-23T03:12:18.9889822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9889973Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9890317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9890428Z getattr(self, test_name)() 2022-11-23T03:12:18.9890780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9890925Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9891317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9891410Z fn() 2022-11-23T03:12:18.9891908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9891990Z fn() 2022-11-23T03:12:18.9892338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9892441Z output = model(*input) 2022-11-23T03:12:18.9892780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9893062Z test(self, **param_kwargs) 2022-11-23T03:12:18.9893404Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9893561Z test(self, **param_kwargs) 2022-11-23T03:12:18.9893876Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9894007Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9894351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9894462Z return func(*args, **kwargs) 2022-11-23T03:12:18.9894805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9894921Z return func(*args, **kwargs) 2022-11-23T03:12:18.9895276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9895440Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9895680Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9895786Z self.run_subtests( 2022-11-23T03:12:18.9896029Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9896131Z self.run_subtests( 2022-11-23T03:12:18.9896489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9896599Z _lazy_init(state, module) 2022-11-23T03:12:18.9896932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9897082Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9897577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9897721Z 
test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9898050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9898185Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9898528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9898664Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9898990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9899121Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9899431Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9899547Z return func(*args, **kwargs) 2022-11-23T03:12:18.9899896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9900001Z output = model(*input) 2022-11-23T03:12:18.9900573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9900693Z output = model(*input) 2022-11-23T03:12:18.9901055Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9901147Z p_assert( 2022-11-23T03:12:18.9901458Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9901589Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9901903Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9902032Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9902355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:18.9902469Z traceback.print_stack() 2022-11-23T03:12:18.9902829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9903056Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9903419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9903582Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9904153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9904273Z _lazy_init(state, module) 2022-11-23T03:12:18.9904629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9904738Z _lazy_init(state, module) 2022-11-23T03:12:18.9905068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9905207Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9905543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9905674Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9905998Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9906111Z return func(*args, **kwargs) 2022-11-23T03:12:18.9906437Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9906549Z return func(*args, **kwargs) 2022-11-23T03:12:18.9906906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9906997Z p_assert( 2022-11-23T03:12:18.9907362Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9907453Z p_assert( 2022-11-23T03:12:18.9907777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9907892Z traceback.print_stack() 2022-11-23T03:12:18.9908215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9908329Z traceback.print_stack() 2022-11-23T03:12:18.9908440Z File "", line 1, in 2022-11-23T03:12:18.9908639Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9908768Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9908957Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9909095Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9909292Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9909384Z self.run() 2022-11-23T03:12:18.9909639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9909781Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9910109Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9910229Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9910575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9910685Z getattr(self, test_name)() 2022-11-23T03:12:18.9911191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9911450Z fn() 2022-11-23T03:12:18.9911795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:18.9911906Z test(self, **param_kwargs) 2022-11-23T03:12:18.9912312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9912429Z return func(*args, **kwargs) 2022-11-23T03:12:18.9912671Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9912772Z self.run_subtests( 2022-11-23T03:12:18.9913107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9913257Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9913598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9913739Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9914096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9914203Z output = model(*input) 2022-11-23T03:12:18.9914521Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9914652Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9915014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9915180Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9915523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9915633Z _lazy_init(state, module) 2022-11-23T03:12:18.9915972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9916102Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9916425Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9916542Z return func(*args, **kwargs) 2022-11-23T03:12:18.9917065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9917156Z p_assert( 2022-11-23T03:12:18.9917466Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9917579Z traceback.print_stack() 2022-11-23T03:12:18.9917795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9918010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9918221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9918433Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:18.9918547Z File "", line 1, in 2022-11-23T03:12:18.9918737Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9918906Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9919094Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9919228Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9919422Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9919510Z self.run() 2022-11-23T03:12:18.9919698Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9919826Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9920139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9920259Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9920788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9920983Z getattr(self, test_name)() 2022-11-23T03:12:18.9921333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9921420Z fn() 2022-11-23T03:12:18.9921770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9921880Z test(self, **param_kwargs) 2022-11-23T03:12:18.9922215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9922328Z return func(*args, **kwargs) 2022-11-23T03:12:18.9922569Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9922670Z self.run_subtests( 2022-11-23T03:12:18.9923007Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9923162Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9923514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9923655Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9924015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9924124Z output = model(*input) 2022-11-23T03:12:18.9924440Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9924569Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9924933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9925098Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9925450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9925569Z _lazy_init(state, module) 2022-11-23T03:12:18.9925909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9926035Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9926153Z File "", line 1, in 2022-11-23T03:12:18.9926479Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9926591Z return func(*args, **kwargs) 2022-11-23T03:12:18.9926788Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9926918Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9927284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9927369Z p_assert( 2022-11-23T03:12:18.9927564Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9927748Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9928242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9928353Z traceback.print_stack() 2022-11-23T03:12:18.9928724Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9928817Z self.run() 2022-11-23T03:12:18.9929005Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9929133Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9929457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9929578Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9929924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9930085Z getattr(self, test_name)() 2022-11-23T03:12:18.9930438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9930525Z fn() 2022-11-23T03:12:18.9930923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9931031Z test(self, **param_kwargs) 2022-11-23T03:12:18.9931537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9931647Z return func(*args, **kwargs) 2022-11-23T03:12:18.9931879Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9931979Z self.run_subtests( 2022-11-23T03:12:18.9932302Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9932449Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9932978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9933115Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9933476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9933588Z output = model(*input) 2022-11-23T03:12:18.9933900Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9934029Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9934390Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9934553Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9934909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9935022Z _lazy_init(state, module) 2022-11-23T03:12:18.9935362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9935493Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9935974Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9936081Z return func(*args, **kwargs) 2022-11-23T03:12:18.9936431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9936519Z p_assert( 2022-11-23T03:12:18.9936831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9936934Z traceback.print_stack() 2022-11-23T03:12:18.9937048Z File "", line 1, in 2022-11-23T03:12:18.9937295Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9937428Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9937615Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9937747Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9937938Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9938020Z self.run() 2022-11-23T03:12:18.9938205Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9938334Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9938648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9938765Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9939278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9939437Z getattr(self, test_name)() 2022-11-23T03:12:18.9939786Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9939865Z fn() 2022-11-23T03:12:18.9940216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9940327Z test(self, **param_kwargs) 2022-11-23T03:12:18.9940667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9940780Z return func(*args, **kwargs) 2022-11-23T03:12:18.9941019Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9941121Z self.run_subtests( 2022-11-23T03:12:18.9941457Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9941604Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9941954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9942095Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9942459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9942566Z output = model(*input) 2022-11-23T03:12:18.9942886Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9943066Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9943592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9943745Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9944292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9944581Z _lazy_init(state, module) 2022-11-23T03:12:18.9944922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9945055Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9945376Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9945488Z return func(*args, **kwargs) 2022-11-23T03:12:18.9945854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9945938Z p_assert( 2022-11-23T03:12:18.9946261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9946375Z traceback.print_stack() 2022-11-23T03:12:18.9946492Z File "", line 1, in 2022-11-23T03:12:18.9946762Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9946901Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9947089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9947222Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9947423Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9947515Z self.run() 2022-11-23T03:12:18.9947705Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9947840Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9948171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9948292Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9948639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9948965Z getattr(self, test_name)() 2022-11-23T03:12:18.9949304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9949389Z fn() 2022-11-23T03:12:18.9949726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9949834Z test(self, **param_kwargs) 2022-11-23T03:12:18.9950340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9950453Z return func(*args, **kwargs) 2022-11-23T03:12:18.9950696Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9950791Z self.run_subtests( 2022-11-23T03:12:18.9951129Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9951286Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9951636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9951776Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9952136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9952243Z output = model(*input) 2022-11-23T03:12:18.9952555Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9952679Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9953043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9953207Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9953566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9953675Z _lazy_init(state, module) 2022-11-23T03:12:18.9954011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9954144Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9954465Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9954570Z return func(*args, **kwargs) 2022-11-23T03:12:18.9954937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9955028Z p_assert( 2022-11-23T03:12:18.9955351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:18.9955466Z traceback.print_stack() 2022-11-23T03:12:18.9955696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9956130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9956351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9956559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9956673Z File "", line 1, in 2022-11-23T03:12:18.9956784Z File "", line 1, in 2022-11-23T03:12:18.9956975Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9957102Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9957286Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9957422Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9957616Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9957785Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9957983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9958072Z self.run() 2022-11-23T03:12:18.9958253Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9958385Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9958568Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9958696Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9958884Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9958974Z self.run() 2022-11-23T03:12:18.9959158Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9959287Z self._target(*self._args, **self._kwargs) 
2022-11-23T03:12:18.9959606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9959729Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9960046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9960162Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9960490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9960598Z getattr(self, test_name)() 2022-11-23T03:12:18.9960937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9961046Z getattr(self, test_name)() 2022-11-23T03:12:18.9961375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9961459Z fn() 2022-11-23T03:12:18.9961790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9961876Z fn() 2022-11-23T03:12:18.9962216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9962326Z test(self, **param_kwargs) 2022-11-23T03:12:18.9962659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9962767Z test(self, **param_kwargs) 2022-11-23T03:12:18.9963095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9963205Z return func(*args, **kwargs) 2022-11-23T03:12:18.9963535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9963644Z return func(*args, **kwargs) 2022-11-23T03:12:18.9964048Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9964203Z self.run_subtests( 2022-11-23T03:12:18.9964452Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9964553Z self.run_subtests( 2022-11-23T03:12:18.9964671Z File "", line 1, in 2022-11-23T03:12:18.9965013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9965163Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9965493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9965641Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9965990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9966132Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9966395Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9966526Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9967030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9967167Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9967510Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9967616Z output = model(*input) 2022-11-23T03:12:18.9967799Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9967933Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9968473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T03:12:18.9968591Z output = model(*input) 2022-11-23T03:12:18.9968906Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9969036Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9969232Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9969324Z self.run() 2022-11-23T03:12:18.9969638Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9969766Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9970130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9970296Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9970486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9970621Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9970985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9971149Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9971503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9971612Z _lazy_init(state, module) 2022-11-23T03:12:18.9971967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9972076Z _lazy_init(state, module) 2022-11-23T03:12:18.9972414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9972544Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9972879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T03:12:18.9973012Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9973386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9973513Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9980074Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9980232Z return func(*args, **kwargs) 2022-11-23T03:12:18.9980763Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:18.9980874Z return func(*args, **kwargs) 2022-11-23T03:12:18.9981214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9981329Z getattr(self, test_name)() 2022-11-23T03:12:18.9981869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9982063Z p_assert( 2022-11-23T03:12:18.9982441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9982531Z p_assert( 2022-11-23T03:12:18.9982881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9982967Z fn() 2022-11-23T03:12:18.9983285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9983403Z traceback.print_stack() 2022-11-23T03:12:18.9983725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9983837Z traceback.print_stack() 2022-11-23T03:12:18.9984500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9984613Z test(self, **param_kwargs) 
2022-11-23T03:12:18.9984969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9985082Z return func(*args, **kwargs) 2022-11-23T03:12:18.9985318Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9985419Z self.run_subtests( 2022-11-23T03:12:18.9985757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9985906Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9986263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9986406Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9986770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9986885Z output = model(*input) 2022-11-23T03:12:18.9987196Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9987326Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9987691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9987855Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9988262Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9988374Z _lazy_init(state, module) 2022-11-23T03:12:18.9988714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9988846Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9989164Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T03:12:18.9989282Z return func(*args, **kwargs) 2022-11-23T03:12:18.9989735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9989836Z p_assert( 2022-11-23T03:12:18.9990163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9990280Z traceback.print_stack() 2022-11-23T03:12:18.9990401Z File "", line 1, in 2022-11-23T03:12:18.9990598Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:18.9990722Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:18.9990911Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:18.9991049Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:18.9991250Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:18.9991406Z self.run() 2022-11-23T03:12:18.9991603Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:18.9991738Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:18.9992219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:18.9992341Z self.run_test(test_name, pipe) 2022-11-23T03:12:18.9992678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:18.9992790Z getattr(self, test_name)() 2022-11-23T03:12:18.9993123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:18.9993207Z fn() 2022-11-23T03:12:18.9993547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:18.9993655Z test(self, **param_kwargs) 
2022-11-23T03:12:18.9994179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:18.9994293Z return func(*args, **kwargs) 2022-11-23T03:12:18.9994536Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:18.9994637Z self.run_subtests( 2022-11-23T03:12:18.9994977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:18.9995128Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:18.9995477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:18.9995617Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:18.9995972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:18.9996085Z output = model(*input) 2022-11-23T03:12:18.9996401Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:18.9996531Z return forward_call(*input, **kwargs) 2022-11-23T03:12:18.9996896Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:18.9997062Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:18.9997414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:18.9997524Z _lazy_init(state, module) 2022-11-23T03:12:18.9998013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:18.9998140Z handle.init_flat_param_attributes() 2022-11-23T03:12:18.9998454Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T03:12:18.9998622Z return func(*args, **kwargs) 2022-11-23T03:12:18.9998989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:18.9999078Z p_assert( 2022-11-23T03:12:18.9999394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:18.9999505Z traceback.print_stack() 2022-11-23T03:12:18.9999718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:18.9999935Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0000149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0000364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0000479Z File "", line 1, in 2022-11-23T03:12:19.0000727Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0000856Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0001040Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0001168Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0001363Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0001454Z self.run() 2022-11-23T03:12:19.0001636Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0001764Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0002265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0002388Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0002738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in 
run_test 2022-11-23T03:12:19.0002852Z getattr(self, test_name)() 2022-11-23T03:12:19.0003197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0003283Z fn() 2022-11-23T03:12:19.0003636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0003748Z test(self, **param_kwargs) 2022-11-23T03:12:19.0004091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0004204Z return func(*args, **kwargs) 2022-11-23T03:12:19.0004437Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0004542Z self.run_subtests( 2022-11-23T03:12:19.0004880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0005038Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0005557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0005693Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0006042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0006145Z output = model(*input) 2022-11-23T03:12:19.0006438Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0006570Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0007108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0007272Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0007679Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0007796Z _lazy_init(state, module) 2022-11-23T03:12:19.0008136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0008267Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0008593Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0008700Z return func(*args, **kwargs) 2022-11-23T03:12:19.0009065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0009157Z p_assert( 2022-11-23T03:12:19.0009479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0009593Z traceback.print_stack() 2022-11-23T03:12:19.0009760Z File "", line 1, in 2022-11-23T03:12:19.0009960Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0010085Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0010273Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0010412Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0010611Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0010706Z self.run() 2022-11-23T03:12:19.0010897Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0011031Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0011361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0011478Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0011827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T03:12:19.0011946Z getattr(self, test_name)() 2022-11-23T03:12:19.0012292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0012381Z fn() 2022-11-23T03:12:19.0012734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0012849Z test(self, **param_kwargs) 2022-11-23T03:12:19.0013195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0013303Z return func(*args, **kwargs) 2022-11-23T03:12:19.0013543Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0013646Z self.run_subtests( 2022-11-23T03:12:19.0013985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0014146Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0014497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0014640Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0014760Z File "", line 1, in 2022-11-23T03:12:19.0015115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0015222Z output = model(*input) 2022-11-23T03:12:19.0015535Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0015665Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0015863Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0015993Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0016407Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0016583Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0016768Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0016907Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0017263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0017373Z _lazy_init(state, module) 2022-11-23T03:12:19.0017575Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0017667Z self.run() 2022-11-23T03:12:19.0018008Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0018132Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0018373Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0018512Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0018841Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0018954Z return func(*args, **kwargs) 2022-11-23T03:12:19.0019281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0019402Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0019769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0019853Z p_assert( 2022-11-23T03:12:19.0020197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0020308Z getattr(self, test_name)() 2022-11-23T03:12:19.0020630Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0020750Z traceback.print_stack() 2022-11-23T03:12:19.0021100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0021186Z fn() 2022-11-23T03:12:19.0021539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0021643Z test(self, **param_kwargs) 2022-11-23T03:12:19.0021984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0022096Z return func(*args, **kwargs) 2022-11-23T03:12:19.0022342Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0022444Z self.run_subtests( 2022-11-23T03:12:19.0022782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0022939Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0023292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0023426Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0023789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0024127Z output = model(*input) 2022-11-23T03:12:19.0024468Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0024597Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0024957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0025121Z args, kwargs = _root_pre_forward(self, self, 
*args, **kwargs) 2022-11-23T03:12:19.0025552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0025666Z _lazy_init(state, module) 2022-11-23T03:12:19.0026014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0026142Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0026464Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0026577Z return func(*args, **kwargs) 2022-11-23T03:12:19.0026940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0027033Z p_assert( 2022-11-23T03:12:19.0027361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0027468Z traceback.print_stack() 2022-11-23T03:12:19.0027648Z File "", line 1, in 2022-11-23T03:12:19.0027851Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0027982Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0028172Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0028311Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0028510Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0028594Z self.run() 2022-11-23T03:12:19.0028787Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0028923Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0029254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0029376Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0029722Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0029840Z getattr(self, test_name)() 2022-11-23T03:12:19.0030187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0030267Z fn() 2022-11-23T03:12:19.0030621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0030731Z test(self, **param_kwargs) 2022-11-23T03:12:19.0031127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0031244Z return func(*args, **kwargs) 2022-11-23T03:12:19.0031485Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0031587Z self.run_subtests( 2022-11-23T03:12:19.0031927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0032080Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0032430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0032574Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0032937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0033045Z output = model(*input) 2022-11-23T03:12:19.0033358Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0033489Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0033852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0034012Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0034419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0034534Z _lazy_init(state, module) 2022-11-23T03:12:19.0034872Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0035004Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0035330Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0035442Z return func(*args, **kwargs) 2022-11-23T03:12:19.0035810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0035893Z p_assert( 2022-11-23T03:12:19.0036216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0036394Z traceback.print_stack() 2022-11-23T03:12:19.0036624Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0036847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0037069Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0037290Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.0037409Z File "", line 1, in 2022-11-23T03:12:19.0037604Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0037735Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0037927Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0038068Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0038267Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0038373Z self.run() 2022-11-23T03:12:19.0038567Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0038704Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0038815Z File "", line 1, in 2022-11-23T03:12:19.0039148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0039270Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0039618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0039730Z getattr(self, test_name)() 2022-11-23T03:12:19.0039929Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0040059Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0040397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0040492Z fn() 2022-11-23T03:12:19.0040685Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0040826Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0041178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0041290Z test(self, **param_kwargs) 2022-11-23T03:12:19.0041489Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0041581Z self.run() 2022-11-23T03:12:19.0041919Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0042033Z return func(*args, **kwargs) 2022-11-23T03:12:19.0042224Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0042358Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0042650Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0042756Z self.run_subtests( 2022-11-23T03:12:19.0043087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0043212Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0043544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0043696Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0044044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0044158Z getattr(self, test_name)() 2022-11-23T03:12:19.0044507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0044646Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0045043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0045130Z fn() 2022-11-23T03:12:19.0045484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0045595Z output = model(*input) 2022-11-23T03:12:19.0045950Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0046062Z test(self, **param_kwargs) 2022-11-23T03:12:19.0046373Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0046502Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0046847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0046959Z return func(*args, **kwargs) 2022-11-23T03:12:19.0047320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0047485Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0047728Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0047836Z self.run_subtests( 2022-11-23T03:12:19.0048194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0048307Z _lazy_init(state, module) 2022-11-23T03:12:19.0048647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0048799Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0049130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0049269Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0049620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0049763Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0050089Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:19.0050201Z return func(*args, **kwargs) 2022-11-23T03:12:19.0050561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0050671Z output = model(*input) 2022-11-23T03:12:19.0051029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0051124Z p_assert( 2022-11-23T03:12:19.0051437Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0051572Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0051943Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0052064Z traceback.print_stack() 2022-11-23T03:12:19.0052426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0052588Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0052933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0053047Z _lazy_init(state, module) 2022-11-23T03:12:19.0053381Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0053517Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0053842Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0054007Z return func(*args, **kwargs) 2022-11-23T03:12:19.0054380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0054475Z p_assert( 2022-11-23T03:12:19.0054789Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in 
p_assert 2022-11-23T03:12:19.0054911Z traceback.print_stack() 2022-11-23T03:12:19.0055030Z File "", line 1, in 2022-11-23T03:12:19.0055228Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0055358Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0055549Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0055688Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0055881Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0055980Z self.run() 2022-11-23T03:12:19.0056175Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0056309Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0056639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0056762Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0057112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0057223Z getattr(self, test_name)() 2022-11-23T03:12:19.0057561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0057650Z fn() 2022-11-23T03:12:19.0058002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0058118Z test(self, **param_kwargs) 2022-11-23T03:12:19.0058465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0058582Z return func(*args, **kwargs) 2022-11-23T03:12:19.0058822Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0058925Z self.run_subtests( 2022-11-23T03:12:19.0059256Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0059406Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0059755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0059896Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0060259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0060373Z output = model(*input) 2022-11-23T03:12:19.0060736Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0060873Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0061231Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0061398Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0061756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0061872Z _lazy_init(state, module) 2022-11-23T03:12:19.0062210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0062343Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0062668Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0062835Z return func(*args, **kwargs) 2022-11-23T03:12:19.0063198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0063289Z p_assert( 2022-11-23T03:12:19.0063611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.0063724Z traceback.print_stack() 2022-11-23T03:12:19.0063844Z File "", line 1, in 2022-11-23T03:12:19.0064343Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0064474Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0064665Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0064798Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0065003Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0065107Z self.run() 2022-11-23T03:12:19.0065302Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0065440Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0065773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0065892Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0066234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0066347Z getattr(self, test_name)() 2022-11-23T03:12:19.0066695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0066781Z fn() 2022-11-23T03:12:19.0067131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0067246Z test(self, **param_kwargs) 2022-11-23T03:12:19.0067588Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0067701Z return func(*args, **kwargs) 2022-11-23T03:12:19.0067938Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0068039Z self.run_subtests( 2022-11-23T03:12:19.0068380Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0068535Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0068887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0069029Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0069389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0069622Z output = model(*input) 2022-11-23T03:12:19.0069951Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0070083Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0070450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0070614Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0070969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0071079Z _lazy_init(state, module) 2022-11-23T03:12:19.0071418Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0071547Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0071934Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0072053Z return func(*args, **kwargs) 2022-11-23T03:12:19.0072418Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0072508Z p_assert( 2022-11-23T03:12:19.0072835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.0072949Z traceback.print_stack() 2022-11-23T03:12:19.0073175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0073399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0073613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0073834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0073959Z File "", line 1, in 2022-11-23T03:12:19.0074162Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0074294Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0074484Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0074624Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0074741Z File "", line 1, in 2022-11-23T03:12:19.0074935Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0075030Z self.run() 2022-11-23T03:12:19.0075223Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0075354Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0075549Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0075676Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0076015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0076132Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0076319Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0076458Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0076814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T03:12:19.0076927Z getattr(self, test_name)() 2022-11-23T03:12:19.0077128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0077220Z self.run() 2022-11-23T03:12:19.0077570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0077649Z fn() 2022-11-23T03:12:19.0077840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0077981Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0078390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0078517Z test(self, **param_kwargs) 2022-11-23T03:12:19.0078637Z File "", line 1, in 2022-11-23T03:12:19.0078968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0079089Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0079425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0079542Z return func(*args, **kwargs) 2022-11-23T03:12:19.0079887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0080001Z getattr(self, test_name)() 2022-11-23T03:12:19.0080253Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0080388Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0080629Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0080733Z self.run_subtests( 2022-11-23T03:12:19.0081079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0081166Z fn() 2022-11-23T03:12:19.0081353Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0081491Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0081835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0081986Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0082348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0082460Z test(self, **param_kwargs) 2022-11-23T03:12:19.0082663Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0082760Z self.run() 2022-11-23T03:12:19.0083107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0083250Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0083593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0083707Z return func(*args, **kwargs) 2022-11-23T03:12:19.0083900Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0084027Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0084388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0084500Z output = model(*input) 2022-11-23T03:12:19.0084745Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0084848Z self.run_subtests( 2022-11-23T03:12:19.0085172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0085295Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0085609Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0085731Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0086071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0086221Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0086566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0086730Z getattr(self, test_name)() 2022-11-23T03:12:19.0087103Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0087267Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0087617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0087750Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0088094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0088184Z fn() 2022-11-23T03:12:19.0088601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0088713Z _lazy_init(state, module) 2022-11-23T03:12:19.0089077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0089238Z output = model(*input) 2022-11-23T03:12:19.0089593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0089698Z test(self, **param_kwargs) 2022-11-23T03:12:19.0090034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:19.0090166Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0090478Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0090609Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0090956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0091071Z return func(*args, **kwargs) 2022-11-23T03:12:19.0091402Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0091509Z return func(*args, **kwargs) 2022-11-23T03:12:19.0091872Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0092036Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0092279Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0092379Z self.run_subtests( 2022-11-23T03:12:19.0092745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0092838Z p_assert( 2022-11-23T03:12:19.0093189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0093296Z _lazy_init(state, module) 2022-11-23T03:12:19.0093640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0093794Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0094117Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0094235Z traceback.print_stack() 2022-11-23T03:12:19.0094571Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0094706Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0095058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0095194Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0095523Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0095638Z return func(*args, **kwargs) 2022-11-23T03:12:19.0096048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0096166Z output = model(*input) 2022-11-23T03:12:19.0096535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0096631Z p_assert( 2022-11-23T03:12:19.0096944Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0097066Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0097388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0097503Z traceback.print_stack() 2022-11-23T03:12:19.0097866Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0098095Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0098457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0098572Z _lazy_init(state, module) 2022-11-23T03:12:19.0098916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0099040Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:19.0099365Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0099480Z return func(*args, **kwargs) 2022-11-23T03:12:19.0099849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0099940Z p_assert( 2022-11-23T03:12:19.0100266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0100384Z traceback.print_stack() 2022-11-23T03:12:19.0100506Z File "", line 1, in 2022-11-23T03:12:19.0100695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0100826Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0101014Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0101154Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0101359Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0101455Z self.run() 2022-11-23T03:12:19.0101648Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0101775Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0102103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0102225Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0102583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0102696Z getattr(self, test_name)() 2022-11-23T03:12:19.0103039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0103127Z fn() 2022-11-23T03:12:19.0103480Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0103584Z test(self, **param_kwargs) 2022-11-23T03:12:19.0104199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0104325Z return func(*args, **kwargs) 2022-11-23T03:12:19.0104571Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0104674Z self.run_subtests( 2022-11-23T03:12:19.0105094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0105253Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0105607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0105743Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0106106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0106217Z output = model(*input) 2022-11-23T03:12:19.0106531Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0106663Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0107022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0107260Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0107621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0107724Z _lazy_init(state, module) 2022-11-23T03:12:19.0108064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:19.0108197Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0108522Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0108634Z return func(*args, **kwargs) 2022-11-23T03:12:19.0108996Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0109090Z p_assert( 2022-11-23T03:12:19.0109412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0109525Z traceback.print_stack() 2022-11-23T03:12:19.0109753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0109978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0110200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0110419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.0110536Z File "", line 1, in 2022-11-23T03:12:19.0110736Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0110869Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0111052Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0111191Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0111393Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0111491Z self.run() 2022-11-23T03:12:19.0111688Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0111829Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0112161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0112285Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0112631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0112748Z getattr(self, test_name)() 2022-11-23T03:12:19.0113092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0113187Z fn() 2022-11-23T03:12:19.0113540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0113656Z test(self, **param_kwargs) 2022-11-23T03:12:19.0114046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0114166Z return func(*args, **kwargs) 2022-11-23T03:12:19.0114398Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0114500Z self.run_subtests( 2022-11-23T03:12:19.0114840Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0114992Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0115344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0115485Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0115849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0116015Z output = model(*input) 2022-11-23T03:12:19.0116325Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0116455Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0116822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0116987Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0117343Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0117453Z _lazy_init(state, module) 2022-11-23T03:12:19.0117791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0117923Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0118037Z File "", line 1, in 2022-11-23T03:12:19.0118368Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0118481Z return func(*args, **kwargs) 2022-11-23T03:12:19.0118680Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0118811Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0119181Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0119274Z p_assert( 2022-11-23T03:12:19.0119456Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0119599Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0119924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0120038Z traceback.print_stack() 2022-11-23T03:12:19.0120241Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0120337Z self.run() 2022-11-23T03:12:19.0120526Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0120662Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0120979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0121102Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0121452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0121564Z getattr(self, test_name)() 2022-11-23T03:12:19.0121909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0121998Z fn() 2022-11-23T03:12:19.0122348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0122463Z test(self, **param_kwargs) 2022-11-23T03:12:19.0122843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0122967Z return func(*args, **kwargs) 2022-11-23T03:12:19.0123210Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0123312Z self.run_subtests( 2022-11-23T03:12:19.0123654Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0123805Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0124154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0124297Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0124408Z File "", line 1, in 2022-11-23T03:12:19.0124827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0124938Z output = model(*input) 2022-11-23T03:12:19.0125252Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0125380Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0125584Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0125715Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0126075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0126232Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0126424Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0126566Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0126929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0127047Z _lazy_init(state, module) 2022-11-23T03:12:19.0127252Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0127347Z self.run() 2022-11-23T03:12:19.0127683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0127813Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:19.0128003Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0128137Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0128464Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0128577Z return func(*args, **kwargs) 2022-11-23T03:12:19.0128905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0129033Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0129396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0129489Z p_assert( 2022-11-23T03:12:19.0129835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0129946Z getattr(self, test_name)() 2022-11-23T03:12:19.0130269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0130382Z traceback.print_stack() 2022-11-23T03:12:19.0130726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0130817Z fn() 2022-11-23T03:12:19.0131209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0131375Z test(self, **param_kwargs) 2022-11-23T03:12:19.0131727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0131840Z return func(*args, **kwargs) 2022-11-23T03:12:19.0132083Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0132188Z self.run_subtests( 2022-11-23T03:12:19.0132524Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0132675Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0133020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0133163Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0133524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0133687Z output = model(*input) 2022-11-23T03:12:19.0134005Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0134133Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0134495Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0134659Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0135004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0135113Z _lazy_init(state, module) 2022-11-23T03:12:19.0135451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0135584Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0135918Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0136031Z return func(*args, **kwargs) 2022-11-23T03:12:19.0136398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0136489Z p_assert( 2022-11-23T03:12:19.0136807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.0136922Z traceback.print_stack() 2022-11-23T03:12:19.0137041Z File "", line 1, in 2022-11-23T03:12:19.0137239Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0137371Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0137567Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0137707Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0137912Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0137999Z self.run() 2022-11-23T03:12:19.0138192Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0138325Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0138652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0138773Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0139123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0139235Z getattr(self, test_name)() 2022-11-23T03:12:19.0139575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0139661Z fn() 2022-11-23T03:12:19.0140015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0140181Z test(self, **param_kwargs) 2022-11-23T03:12:19.0140535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0140648Z return func(*args, **kwargs) 2022-11-23T03:12:19.0140888Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0140990Z self.run_subtests( 2022-11-23T03:12:19.0141319Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0141471Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0141826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0141970Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0142387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0142496Z output = model(*input) 2022-11-23T03:12:19.0142808Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0142937Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0143292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0143458Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0143809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0144120Z _lazy_init(state, module) 2022-11-23T03:12:19.0144474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0144607Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0144946Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0145062Z return func(*args, **kwargs) 2022-11-23T03:12:19.0145420Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0145513Z p_assert( 2022-11-23T03:12:19.0145841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.0145956Z traceback.print_stack() 2022-11-23T03:12:19.0146185Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0146409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0146630Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0146848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0146966Z File "", line 1, in 2022-11-23T03:12:19.0147168Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0147299Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0147486Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0147625Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0147831Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0147923Z self.run() 2022-11-23T03:12:19.0148118Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0148246Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0148580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0148702Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0149123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0149245Z getattr(self, test_name)() 2022-11-23T03:12:19.0149595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0149687Z fn() 2022-11-23T03:12:19.0150042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.0150146Z test(self, **param_kwargs) 2022-11-23T03:12:19.0150486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0150601Z return func(*args, **kwargs) 2022-11-23T03:12:19.0150846Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0150951Z self.run_subtests( 2022-11-23T03:12:19.0151359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0151510Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0151865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0151998Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0152362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0152472Z output = model(*input) 2022-11-23T03:12:19.0152783Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0152913Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0153279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0153448Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0153805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0153911Z _lazy_init(state, module) 2022-11-23T03:12:19.0154253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0154384Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0154712Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0154831Z return func(*args, **kwargs) 2022-11-23T03:12:19.0155196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0155294Z p_assert( 2022-11-23T03:12:19.0155415Z File "", line 1, in 2022-11-23T03:12:19.0155605Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0155749Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0155942Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0156083Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0156407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0156523Z traceback.print_stack() 2022-11-23T03:12:19.0156726Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0156811Z self.run() 2022-11-23T03:12:19.0157001Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0157138Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0157463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0157588Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0157981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0158101Z getattr(self, test_name)() 2022-11-23T03:12:19.0158219Z File "", line 1, in 2022-11-23T03:12:19.0158557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0158643Z fn() 2022-11-23T03:12:19.0158996Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0159106Z test(self, **param_kwargs) 2022-11-23T03:12:19.0159304Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0159434Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0159778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0159959Z return func(*args, **kwargs) 2022-11-23T03:12:19.0160145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0160288Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0160534Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0160637Z self.run_subtests( 2022-11-23T03:12:19.0160838Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0160932Z self.run() 2022-11-23T03:12:19.0161275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0161420Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0161610Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0161743Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0162099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0162241Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0162565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0162688Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0163046Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0163148Z output = model(*input) 2022-11-23T03:12:19.0163495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0163609Z getattr(self, test_name)() 2022-11-23T03:12:19.0163926Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0164054Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0164404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0164493Z fn() 2022-11-23T03:12:19.0164854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0165013Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0165366Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0165478Z test(self, **param_kwargs) 2022-11-23T03:12:19.0165947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0166061Z _lazy_init(state, module) 2022-11-23T03:12:19.0166408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0166525Z return func(*args, **kwargs) 2022-11-23T03:12:19.0166949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0167079Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0167323Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0167425Z self.run_subtests( 
2022-11-23T03:12:19.0167753Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0167867Z return func(*args, **kwargs) 2022-11-23T03:12:19.0168208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0168357Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0168724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0168855Z p_assert( 2022-11-23T03:12:19.0169214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0169355Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0169679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0169792Z traceback.print_stack() 2022-11-23T03:12:19.0170148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0170257Z output = model(*input) 2022-11-23T03:12:19.0170569Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0170691Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0171056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0171225Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0171582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0171693Z _lazy_init(state, module) 2022-11-23T03:12:19.0172034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in 
_lazy_init 2022-11-23T03:12:19.0172167Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0172489Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0172597Z return func(*args, **kwargs) 2022-11-23T03:12:19.0172968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0173060Z p_assert( 2022-11-23T03:12:19.0173384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0173505Z traceback.print_stack() 2022-11-23T03:12:19.0173627Z File "", line 1, in 2022-11-23T03:12:19.0173828Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0173959Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0174143Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0174284Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0174484Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0174576Z self.run() 2022-11-23T03:12:19.0174768Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0174902Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0175231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0175349Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0175743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0175864Z getattr(self, test_name)() 2022-11-23T03:12:19.0176213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0176299Z fn() 2022-11-23T03:12:19.0176650Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0176765Z test(self, **param_kwargs) 2022-11-23T03:12:19.0177107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0177214Z return func(*args, **kwargs) 2022-11-23T03:12:19.0177455Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0177614Z self.run_subtests( 2022-11-23T03:12:19.0177959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0178114Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0178468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0178608Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0178974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0179075Z output = model(*input) 2022-11-23T03:12:19.0179393Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0179529Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0179892Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0180068Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0180424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0180536Z _lazy_init(state, module) 2022-11-23T03:12:19.0180879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:19.0181010Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0181331Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0181444Z return func(*args, **kwargs) 2022-11-23T03:12:19.0181814Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0181904Z p_assert( 2022-11-23T03:12:19.0182228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0182352Z traceback.print_stack() 2022-11-23T03:12:19.0182579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0182797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0183019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0183239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.0183359Z File "", line 1, in 2022-11-23T03:12:19.0183563Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0183694Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0184205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0184366Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0184639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0184746Z self.run() 2022-11-23T03:12:19.0184936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0185072Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0185416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0185538Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0185889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0186001Z getattr(self, test_name)() 2022-11-23T03:12:19.0186339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0186429Z fn() 2022-11-23T03:12:19.0186781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0186960Z test(self, **param_kwargs) 2022-11-23T03:12:19.0187308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0187422Z return func(*args, **kwargs) 2022-11-23T03:12:19.0187665Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0187769Z self.run_subtests( 2022-11-23T03:12:19.0188099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0188250Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0188656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0188797Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0189168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0189278Z output = model(*input) 2022-11-23T03:12:19.0189594Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0189723Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0190081Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0190244Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0190599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0190708Z _lazy_init(state, module) 2022-11-23T03:12:19.0191047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0191178Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0191508Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0191626Z return func(*args, **kwargs) 2022-11-23T03:12:19.0191984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0192077Z p_assert( 2022-11-23T03:12:19.0192402Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.0192524Z traceback.print_stack() 2022-11-23T03:12:19.0192646Z File "", line 1, in 2022-11-23T03:12:19.0192846Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0192977Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0193160Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0193302Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0193553Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0193654Z self.run() 2022-11-23T03:12:19.0193770Z File "", line 1, in 2022-11-23T03:12:19.0193967Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0194103Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0194431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0194544Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0194743Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0194873Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0195221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0195332Z getattr(self, test_name)() 2022-11-23T03:12:19.0195572Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0195714Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0196066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0196147Z fn() 2022-11-23T03:12:19.0196350Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0196442Z self.run() 2022-11-23T03:12:19.0196795Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0196907Z test(self, **param_kwargs) 2022-11-23T03:12:19.0197100Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0197232Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0197571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0197692Z return func(*args, **kwargs) 2022-11-23T03:12:19.0197935Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0198037Z self.run_subtests( 2022-11-23T03:12:19.0198362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0198486Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0198826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0198977Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0199317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0199460Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0199810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0199930Z getattr(self, test_name)() 2022-11-23T03:12:19.0200291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0200397Z output = model(*input) 2022-11-23T03:12:19.0200711Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0200841Z return forward_call(*input, **kwargs) 
2022-11-23T03:12:19.0201177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0201265Z fn() 2022-11-23T03:12:19.0201630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0201793Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0202197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0202315Z test(self, **param_kwargs) 2022-11-23T03:12:19.0202670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0202783Z _lazy_init(state, module) 2022-11-23T03:12:19.0203122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0203235Z return func(*args, **kwargs) 2022-11-23T03:12:19.0203572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0203705Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0203947Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0204052Z self.run_subtests( 2022-11-23T03:12:19.0204465Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0204581Z return func(*args, **kwargs) 2022-11-23T03:12:19.0204914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0205065Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0205429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T03:12:19.0205521Z p_assert( 2022-11-23T03:12:19.0205875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0206017Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0206342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0206457Z traceback.print_stack() 2022-11-23T03:12:19.0206818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0206927Z output = model(*input) 2022-11-23T03:12:19.0207245Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0207376Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0207738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0207902Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0208259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0208373Z _lazy_init(state, module) 2022-11-23T03:12:19.0208707Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0208848Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0209178Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0209293Z return func(*args, **kwargs) 2022-11-23T03:12:19.0209654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0209744Z p_assert( 2022-11-23T03:12:19.0210070Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:19.0210184Z traceback.print_stack() 2022-11-23T03:12:19.0210294Z File "", line 1, in 2022-11-23T03:12:19.0210495Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0210626Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0210815Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0210962Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0211212Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0211313Z self.run() 2022-11-23T03:12:19.0211509Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0211637Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0211964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0212087Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0212433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0212546Z getattr(self, test_name)() 2022-11-23T03:12:19.0212888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0212976Z fn() 2022-11-23T03:12:19.0213370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0213482Z test(self, **param_kwargs) 2022-11-23T03:12:19.0213829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0213945Z return func(*args, **kwargs) 2022-11-23T03:12:19.0214186Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T03:12:19.0214287Z self.run_subtests( 
2022-11-23T03:12:19.0214630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0214783Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0215126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0215268Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0215640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0215747Z output = model(*input) 2022-11-23T03:12:19.0216063Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0216197Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0216565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0216737Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0217090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0217193Z _lazy_init(state, module) 2022-11-23T03:12:19.0217532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0217668Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0217999Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0218112Z return func(*args, **kwargs) 2022-11-23T03:12:19.0218480Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0218572Z p_assert( 2022-11-23T03:12:19.0218887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:19.0219004Z traceback.print_stack() 2022-11-23T03:12:19.0219228Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0219453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0219674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0219941Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0220166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0220384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0220599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0220808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0221024Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0221237Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0221451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0221667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0221948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0222163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0222377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0222584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.0222796Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0223011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0223226Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0223438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0223651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0224086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0224320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0224528Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0224742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0224954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0225165Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0225377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0226136Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0226876Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0227611Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0228404Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0229145Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0229867Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0230594Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0231426Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0231651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0231868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0232085Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0232296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0232518Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0232740Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0232958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0233171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0233392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0233606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0233820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0234034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0234241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0234459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0234674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.0234890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0235103Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0235317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0235530Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0235744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0235950Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0236162Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0236427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0236647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0236858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0237070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0237283Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0237497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0237703Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0237915Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0238128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0238407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.0238623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0238838Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0239053Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0239267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0239478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0239685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0239897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0240114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0240330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0240542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0240755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0240970Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0241187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0241391Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0241605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0241817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0242036Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.0242246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0242458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0242673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0242883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0243167Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0243386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0243597Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0243810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0244080Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0244300Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0244511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0244723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0244935Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0245138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0245351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0245562Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0245776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.0246047Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0246260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.0246363Z dist init r=3, world=4 2022-11-23T03:12:19.0246687Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0246987Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0247286Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0247580Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0247882Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0248174Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0248462Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0248751Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0249040Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0249335Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0249625Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0249915Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0250018Z dist init r=1, world=4 2022-11-23T03:12:19.0250324Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0250630Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0250977Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0251282Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0251572Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0251865Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0252153Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0252447Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0252781Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0253071Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0253359Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0253650Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0253744Z dist init r=0, world=4 2022-11-23T03:12:19.0254062Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0254371Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0254666Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0254960Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0255252Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0255544Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0255844Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0256137Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0256427Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0256717Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0257002Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0257339Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0257449Z dist init r=2, world=4 2022-11-23T03:12:19.0257765Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0258070Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0258368Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0258663Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0258998Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0259290Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0259579Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0259869Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0260150Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0260443Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0260739Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0261038Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0261129Z ok (6.924s) 2022-11-23T03:12:19.0261498Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26172 2022-11-23T03:12:19.0261712Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26173 2022-11-23T03:12:19.0261916Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26174 2022-11-23T03:12:19.0262120Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26175 2022-11-23T03:12:19.0262500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0262658Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0263032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0263214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0263573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0263735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0264327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0264511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0264949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0265123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0265484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0265663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0266018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:19.0266181Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0266544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0266723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0267021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.0267260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.0267483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.0267717Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.0268111Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0268496Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0268868Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0269236Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.0269460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.0269678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.0269893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.0270100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.0271110Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0271216Z warnings.warn( 2022-11-23T03:12:19.0272221Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0272322Z warnings.warn( 2022-11-23T03:12:19.0273323Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0273477Z warnings.warn( 2022-11-23T03:12:19.0274481Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0274583Z warnings.warn( 2022-11-23T03:12:19.0275323Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0276116Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0276849Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0277577Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0278300Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0279016Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0279733Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0280468Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0281188Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0281906Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0282670Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0283398Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0284410Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_exec_order_utils.py:239: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:19.0284589Z (rank, world_num_valid_indices[rank]) 2022-11-23T03:12:19.0285610Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_exec_order_utils.py:239: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T03:12:19.0285738Z (rank, world_num_valid_indices[rank]) 2022-11-23T03:12:19.0286740Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_exec_order_utils.py:239: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:19.0286872Z (rank, world_num_valid_indices[rank]) 2022-11-23T03:12:19.0287874Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_exec_order_utils.py:239: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:12:19.0288001Z (rank, world_num_valid_indices[rank]) 2022-11-23T03:12:19.0288787Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0289521Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0290245Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0291005Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0291110Z dist init r=3, world=4 2022-11-23T03:12:19.0291210Z dist init r=0, world=4 2022-11-23T03:12:19.0291308Z dist init r=2, world=4 2022-11-23T03:12:19.0291404Z dist init r=1, world=4 2022-11-23T03:12:19.0291492Z ok (5.922s) 2022-11-23T03:12:19.0291856Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26473 2022-11-23T03:12:19.0292064Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26474 2022-11-23T03:12:19.0292268Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26475 2022-11-23T03:12:19.0292465Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26476 2022-11-23T03:12:19.0292829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0293048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0293421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0293602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0293956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T03:12:19.0294119Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0294481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0294655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0295008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0295179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0295547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0295727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0296079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0296242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0296610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0296794Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0297022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.0297252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.0297493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.0297722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.0298114Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.0298502Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0298878Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0299256Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0299474Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.0299737Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.0299963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.0300178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.0301189Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0301291Z warnings.warn( 2022-11-23T03:12:19.0302290Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.0302437Z warnings.warn( 2022-11-23T03:12:19.0303435Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0303533Z warnings.warn( 2022-11-23T03:12:19.0304796Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0304901Z warnings.warn( 2022-11-23T03:12:19.0305003Z dist init r=0, world=4 2022-11-23T03:12:19.0305106Z dist init r=1, world=4 2022-11-23T03:12:19.0305194Z dist init r=2, world=4 2022-11-23T03:12:19.0305293Z dist init r=3, world=4 2022-11-23T03:12:19.0305387Z ok (6.022s) 2022-11-23T03:12:19.0305759Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26774 2022-11-23T03:12:19.0305968Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26775 2022-11-23T03:12:19.0306180Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26776 2022-11-23T03:12:19.0306386Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26777 2022-11-23T03:12:19.0306755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0306913Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0307283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0307460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0307815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0307982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0308420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0308608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0308965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0309129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0309486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0309664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0310016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:19.0310179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0310543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0310787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0311021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.0311257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.0311478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.0311706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.0312096Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0312478Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0312857Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0313229Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.0313449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.0313666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.0313880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.0314084Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.0315094Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0315202Z warnings.warn( 2022-11-23T03:12:19.0316196Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0316302Z warnings.warn( 2022-11-23T03:12:19.0317349Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0317457Z warnings.warn( 2022-11-23T03:12:19.0318453Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0318553Z warnings.warn( 2022-11-23T03:12:19.0318653Z dist init r=1, world=4 2022-11-23T03:12:19.0318749Z dist init r=2, world=4 2022-11-23T03:12:19.0318845Z dist init r=0, world=4 2022-11-23T03:12:19.0318989Z dist init r=3, world=4 2022-11-23T03:12:19.0319071Z ok (6.022s) 2022-11-23T03:12:19.0319438Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27075 2022-11-23T03:12:19.0319651Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27076 2022-11-23T03:12:19.0319861Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27077 2022-11-23T03:12:19.0320064Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27078 2022-11-23T03:12:19.0320425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0320590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0320955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0321135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0321491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0321658Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0322022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0322201Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0322558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0322724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0323086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0323270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0323618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:19.0323785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0324156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0324336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0324567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.0324799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.0325031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.0325261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.0325690Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0326087Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0326464Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0326839Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.0327056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.0327268Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.0327482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.0327742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.0328745Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0328850Z warnings.warn( 2022-11-23T03:12:19.0329858Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0329967Z warnings.warn( 2022-11-23T03:12:19.0330960Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0331081Z warnings.warn( 2022-11-23T03:12:19.0332102Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0332205Z warnings.warn( 2022-11-23T03:12:19.0332326Z File "", line 1, in 2022-11-23T03:12:19.0332531Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0332669Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0332860Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0333000Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0333206Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0333291Z self.run() 2022-11-23T03:12:19.0333485Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0333620Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0333956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0334159Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0334522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0334634Z getattr(self, test_name)() 2022-11-23T03:12:19.0334983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:19.0335064Z fn() 2022-11-23T03:12:19.0335420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0335534Z test(self, **param_kwargs) 2022-11-23T03:12:19.0335880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0335997Z return func(*args, **kwargs) 2022-11-23T03:12:19.0336284Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0336452Z self.run_subtests( 2022-11-23T03:12:19.0336797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0336940Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0337303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0337449Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0337822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0337931Z output = model(*input) 2022-11-23T03:12:19.0338245Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0338377Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0338755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0338917Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0339274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0339387Z _lazy_init(state, module) 2022-11-23T03:12:19.0339730Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0339863Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0340189Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0340305Z return func(*args, **kwargs) 2022-11-23T03:12:19.0340671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0340758Z p_assert( 2022-11-23T03:12:19.0341091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0341210Z traceback.print_stack() 2022-11-23T03:12:19.0341334Z File "", line 1, in 2022-11-23T03:12:19.0341538Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0341671Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0341861Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0342000Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0342197Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0342290Z self.run() 2022-11-23T03:12:19.0342482Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0342619Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0342997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0343125Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0343475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0343580Z getattr(self, test_name)() 2022-11-23T03:12:19.0344249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:19.0344349Z fn() 2022-11-23T03:12:19.0344714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0344830Z test(self, **param_kwargs) 2022-11-23T03:12:19.0345173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0345345Z return func(*args, **kwargs) 2022-11-23T03:12:19.0345719Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0345814Z self.run_subtests( 2022-11-23T03:12:19.0346158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0346312Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0346662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0346807Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0347173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0347284Z output = model(*input) 2022-11-23T03:12:19.0347594Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0347722Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0348092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0348259Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0348617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0348731Z _lazy_init(state, module) 
2022-11-23T03:12:19.0349070Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0349204Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0349531Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0349637Z return func(*args, **kwargs) 2022-11-23T03:12:19.0350011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0350109Z p_assert( 2022-11-23T03:12:19.0350433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0350552Z traceback.print_stack() 2022-11-23T03:12:19.0350670Z File "", line 1, in 2022-11-23T03:12:19.0350869Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0351005Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0351190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0351332Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0351532Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0351626Z self.run() 2022-11-23T03:12:19.0351819Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0351959Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0352348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0352481Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0352825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0352941Z getattr(self, test_name)() 2022-11-23T03:12:19.0353286Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0353375Z fn() 2022-11-23T03:12:19.0353726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0353842Z test(self, **param_kwargs) 2022-11-23T03:12:19.0354184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0354337Z return func(*args, **kwargs) 2022-11-23T03:12:19.0354632Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0354741Z self.run_subtests( 2022-11-23T03:12:19.0355082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0355233Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0355589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0355729Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0356093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0356201Z output = model(*input) 2022-11-23T03:12:19.0356509Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0356647Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0357014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0357183Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0357537Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:19.0357645Z _lazy_init(state, module) 2022-11-23T03:12:19.0357987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0358118Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0358437Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0358550Z return func(*args, **kwargs) 2022-11-23T03:12:19.0358927Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0359020Z p_assert( 2022-11-23T03:12:19.0359347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0359461Z traceback.print_stack() 2022-11-23T03:12:19.0359578Z File "", line 1, in 2022-11-23T03:12:19.0359767Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0359901Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0360094Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0360234Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0360434Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0360527Z self.run() 2022-11-23T03:12:19.0360718Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0360904Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0361235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0361361Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0361710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0361823Z getattr(self, test_name)() 
2022-11-23T03:12:19.0362171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0362258Z fn() 2022-11-23T03:12:19.0362611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0362721Z test(self, **param_kwargs) 2022-11-23T03:12:19.0363056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0363220Z return func(*args, **kwargs) 2022-11-23T03:12:19.0363515Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0363620Z self.run_subtests( 2022-11-23T03:12:19.0363964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0364119Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0364473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0364617Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0364973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0365083Z output = model(*input) 2022-11-23T03:12:19.0365403Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0365534Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0365905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0366071Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0366422Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0366532Z _lazy_init(state, module) 2022-11-23T03:12:19.0366862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0366993Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0367318Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0367433Z return func(*args, **kwargs) 2022-11-23T03:12:19.0367807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0367899Z p_assert( 2022-11-23T03:12:19.0368225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0368340Z traceback.print_stack() 2022-11-23T03:12:19.0368453Z File "", line 1, in 2022-11-23T03:12:19.0368653Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0368785Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0368977Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0369118Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0369319Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0369413Z self.run() 2022-11-23T03:12:19.0369602Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0369787Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0370124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0370249Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0370600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T03:12:19.0370717Z getattr(self, test_name)() 2022-11-23T03:12:19.0371063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0371152Z fn() 2022-11-23T03:12:19.0371496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0371609Z test(self, **param_kwargs) 2022-11-23T03:12:19.0371953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0372119Z return func(*args, **kwargs) 2022-11-23T03:12:19.0372410Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0372516Z self.run_subtests( 2022-11-23T03:12:19.0372858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0373010Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0373355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0373499Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0373859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0373972Z output = model(*input) 2022-11-23T03:12:19.0374288Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0374422Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0374785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0374951Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.0375296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0375406Z _lazy_init(state, module) 2022-11-23T03:12:19.0375745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0375881Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0376205Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0376324Z return func(*args, **kwargs) 2022-11-23T03:12:19.0376699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0376795Z p_assert( 2022-11-23T03:12:19.0377113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0377232Z traceback.print_stack() 2022-11-23T03:12:19.0377351Z File "", line 1, in 2022-11-23T03:12:19.0377550Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0377682Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0377873Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0378014Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0378215Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0378304Z self.run() 2022-11-23T03:12:19.0378540Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0378683Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0378805Z File "", line 1, in 2022-11-23T03:12:19.0379134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0379256Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0379610Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0379713Z getattr(self, test_name)() 2022-11-23T03:12:19.0379916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0380051Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0380399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0380537Z fn() 2022-11-23T03:12:19.0380735Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0380879Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0381236Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0381341Z test(self, **param_kwargs) 2022-11-23T03:12:19.0381543Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0381636Z self.run() 2022-11-23T03:12:19.0381985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0382099Z return func(*args, **kwargs) 2022-11-23T03:12:19.0382289Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0382428Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0382719Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0382819Z self.run_subtests( 2022-11-23T03:12:19.0383163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0383316Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0383668Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0383812Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0384379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0384509Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0384867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0384975Z output = model(*input) 2022-11-23T03:12:19.0385330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0385448Z getattr(self, test_name)() 2022-11-23T03:12:19.0385768Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0385901Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0386247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0386337Z fn() 2022-11-23T03:12:19.0386706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0386863Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0387217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0387334Z test(self, **param_kwargs) 2022-11-23T03:12:19.0387754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0387879Z _lazy_init(state, module) 2022-11-23T03:12:19.0388231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0388390Z return 
func(*args, **kwargs) 2022-11-23T03:12:19.0388733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0388858Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0389145Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0389248Z self.run_subtests( 2022-11-23T03:12:19.0389573Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0389753Z return func(*args, **kwargs) 2022-11-23T03:12:19.0390100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0390254Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0390622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0390706Z p_assert( 2022-11-23T03:12:19.0391057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0391199Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0391525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0391643Z traceback.print_stack() 2022-11-23T03:12:19.0392006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0392120Z output = model(*input) 2022-11-23T03:12:19.0392435Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0392557Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0392924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", 
line 685, in forward 2022-11-23T03:12:19.0393092Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0393442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0393554Z _lazy_init(state, module) 2022-11-23T03:12:19.0393896Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0394028Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0394363Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0394469Z return func(*args, **kwargs) 2022-11-23T03:12:19.0394834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0394927Z p_assert( 2022-11-23T03:12:19.0395250Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0395367Z traceback.print_stack() 2022-11-23T03:12:19.0395490Z File "", line 1, in 2022-11-23T03:12:19.0395690Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0395823Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0396007Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0396149Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0396358Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0396498Z self.run() 2022-11-23T03:12:19.0396700Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0396839Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0397172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0397287Z 
self.run_test(test_name, pipe) 2022-11-23T03:12:19.0397638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0397753Z getattr(self, test_name)() 2022-11-23T03:12:19.0398096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0398184Z fn() 2022-11-23T03:12:19.0398535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0398711Z test(self, **param_kwargs) 2022-11-23T03:12:19.0399062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0399169Z return func(*args, **kwargs) 2022-11-23T03:12:19.0399452Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0399555Z self.run_subtests( 2022-11-23T03:12:19.0399897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0400051Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0400402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0400544Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0400911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0401013Z output = model(*input) 2022-11-23T03:12:19.0401332Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0401462Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0401827Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0401992Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0402344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0402456Z _lazy_init(state, module) 2022-11-23T03:12:19.0402795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0402919Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0403257Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0403372Z return func(*args, **kwargs) 2022-11-23T03:12:19.0403748Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0403847Z p_assert( 2022-11-23T03:12:19.0404169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0404292Z traceback.print_stack() 2022-11-23T03:12:19.0404412Z File "", line 1, in 2022-11-23T03:12:19.0404604Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0404739Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0404933Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0405081Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0405348Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0405459Z self.run() 2022-11-23T03:12:19.0405656Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0405791Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0406111Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0406236Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0406586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0406701Z getattr(self, test_name)() 2022-11-23T03:12:19.0407047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0407134Z fn() 2022-11-23T03:12:19.0407485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0407647Z test(self, **param_kwargs) 2022-11-23T03:12:19.0407995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0408109Z return func(*args, **kwargs) 2022-11-23T03:12:19.0408393Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0408497Z self.run_subtests( 2022-11-23T03:12:19.0408837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0408988Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0409112Z File "", line 1, in 2022-11-23T03:12:19.0409457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0409607Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0409976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0410085Z output = model(*input) 2022-11-23T03:12:19.0410292Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 
2022-11-23T03:12:19.0410425Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0410743Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0410872Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0411054Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0411197Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0411560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0411735Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0411939Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0412032Z self.run() 2022-11-23T03:12:19.0412388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0412503Z _lazy_init(state, module) 2022-11-23T03:12:19.0412687Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0412828Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0413167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0413303Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0413628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0413752Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0414132Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0414255Z return func(*args, **kwargs) 2022-11-23T03:12:19.0414599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0414717Z getattr(self, 
test_name)() 2022-11-23T03:12:19.0415085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0415176Z p_assert( 2022-11-23T03:12:19.0415526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0415612Z fn() 2022-11-23T03:12:19.0415937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0416054Z traceback.print_stack() 2022-11-23T03:12:19.0416212Z File "", line 1, in 2022-11-23T03:12:19.0416576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0416691Z test(self, **param_kwargs) 2022-11-23T03:12:19.0417036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0417151Z return func(*args, **kwargs) 2022-11-23T03:12:19.0417352Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0417482Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0417767Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0417863Z self.run_subtests( 2022-11-23T03:12:19.0418057Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0418201Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0418550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0418700Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0418902Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0418996Z self.run() 
2022-11-23T03:12:19.0419338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0419483Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0419676Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0419811Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0420177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0420290Z output = model(*input) 2022-11-23T03:12:19.0420619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0420740Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0421048Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0421181Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0421532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0421645Z getattr(self, test_name)() 2022-11-23T03:12:19.0422011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0422176Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0422522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0422612Z fn() 2022-11-23T03:12:19.0423002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0423122Z _lazy_init(state, module) 2022-11-23T03:12:19.0423480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0423591Z 
test(self, **param_kwargs) 2022-11-23T03:12:19.0424200Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0424348Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0424697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0424813Z return func(*args, **kwargs) 2022-11-23T03:12:19.0425127Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0425330Z return func(*args, **kwargs) 2022-11-23T03:12:19.0425627Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0425741Z self.run_subtests( 2022-11-23T03:12:19.0426116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0426208Z p_assert( 2022-11-23T03:12:19.0426546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0426698Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0427014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0427134Z traceback.print_stack() 2022-11-23T03:12:19.0427486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0427642Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0428007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0428123Z output = model(*input) 2022-11-23T03:12:19.0428433Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in 
_call_impl 2022-11-23T03:12:19.0428567Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0428921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0429089Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0429443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0429555Z _lazy_init(state, module) 2022-11-23T03:12:19.0429902Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0430038Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0430366Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0430482Z return func(*args, **kwargs) 2022-11-23T03:12:19.0430839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0430935Z p_assert( 2022-11-23T03:12:19.0431312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0431434Z traceback.print_stack() 2022-11-23T03:12:19.0431559Z File "", line 1, in 2022-11-23T03:12:19.0431759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0431893Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0432090Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0432283Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0432496Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0432591Z self.run() 2022-11-23T03:12:19.0432790Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0432929Z 
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0433264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0433391Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0433749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0433854Z getattr(self, test_name)() 2022-11-23T03:12:19.0434210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0434346Z fn() 2022-11-23T03:12:19.0434707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0434821Z test(self, **param_kwargs) 2022-11-23T03:12:19.0435163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0435280Z return func(*args, **kwargs) 2022-11-23T03:12:19.0435567Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0435664Z self.run_subtests( 2022-11-23T03:12:19.0436005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0436156Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0436508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0436660Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0437031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0437144Z output = model(*input) 2022-11-23T03:12:19.0437461Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1427, in _call_impl 2022-11-23T03:12:19.0437584Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0437952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0438120Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0438477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0438589Z _lazy_init(state, module) 2022-11-23T03:12:19.0438937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0439073Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0439406Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0439512Z return func(*args, **kwargs) 2022-11-23T03:12:19.0439884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0439977Z p_assert( 2022-11-23T03:12:19.0440302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0440423Z traceback.print_stack() 2022-11-23T03:12:19.0440544Z File "", line 1, in 2022-11-23T03:12:19.0440740Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0440867Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0441104Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0441259Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0441465Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0441563Z self.run() 2022-11-23T03:12:19.0441758Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T03:12:19.0441897Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0442228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0442342Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0442696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0442813Z getattr(self, test_name)() 2022-11-23T03:12:19.0443164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0443306Z fn() 2022-11-23T03:12:19.0443669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0443783Z test(self, **param_kwargs) 2022-11-23T03:12:19.0444126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0444232Z return func(*args, **kwargs) 2022-11-23T03:12:19.0444516Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0444625Z self.run_subtests( 2022-11-23T03:12:19.0444966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0445120Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0445483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0445627Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0445996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0446098Z output = model(*input) 2022-11-23T03:12:19.0446417Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0446557Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0446928Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0447096Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0447223Z File "", line 1, in 2022-11-23T03:12:19.0447582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0447703Z _lazy_init(state, module) 2022-11-23T03:12:19.0448034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0448175Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0448379Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0448513Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0448842Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0448960Z return func(*args, **kwargs) 2022-11-23T03:12:19.0449153Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0449298Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0449658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0449758Z p_assert( 2022-11-23T03:12:19.0450013Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0450114Z self.run() 2022-11-23T03:12:19.0450446Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0450565Z traceback.print_stack() 2022-11-23T03:12:19.0450760Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0450888Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0451218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0451341Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0451689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0451805Z getattr(self, test_name)() 2022-11-23T03:12:19.0452217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0452313Z fn() 2022-11-23T03:12:19.0452668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0452773Z test(self, **param_kwargs) 2022-11-23T03:12:19.0453121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0453237Z return func(*args, **kwargs) 2022-11-23T03:12:19.0453522Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0453629Z self.run_subtests( 2022-11-23T03:12:19.0453973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0454128Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0454489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0454625Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0454989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0455101Z output = model(*input) 
2022-11-23T03:12:19.0455420Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0455556Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0455924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0456091Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0456450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0456568Z _lazy_init(state, module) 2022-11-23T03:12:19.0456902Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0457038Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0457363Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0457479Z return func(*args, **kwargs) 2022-11-23T03:12:19.0457844Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0457935Z p_assert( 2022-11-23T03:12:19.0458260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0458366Z traceback.print_stack() 2022-11-23T03:12:19.0458486Z File "", line 1, in 2022-11-23T03:12:19.0458691Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0458869Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0459074Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0459214Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0459417Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0459510Z self.run() 
2022-11-23T03:12:19.0459693Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0459831Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0460163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0460287Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0460639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0460816Z getattr(self, test_name)() 2022-11-23T03:12:19.0461172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0461265Z fn() 2022-11-23T03:12:19.0461609Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0461721Z test(self, **param_kwargs) 2022-11-23T03:12:19.0462068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0462187Z return func(*args, **kwargs) 2022-11-23T03:12:19.0462476Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0462583Z self.run_subtests( 2022-11-23T03:12:19.0462924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0463079Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0463425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0463568Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0464198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T03:12:19.0464324Z output = model(*input) 2022-11-23T03:12:19.0464648Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0464784Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0465154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0465328Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0465678Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0465884Z _lazy_init(state, module) 2022-11-23T03:12:19.0466230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0466366Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0466696Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0466813Z return func(*args, **kwargs) 2022-11-23T03:12:19.0467180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0467270Z p_assert( 2022-11-23T03:12:19.0467585Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0467700Z traceback.print_stack() 2022-11-23T03:12:19.0467824Z File "", line 1, in 2022-11-23T03:12:19.0468108Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0468253Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0468448Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0468590Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0468785Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 
2022-11-23T03:12:19.0468883Z self.run() 2022-11-23T03:12:19.0469075Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0469210Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0469543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0469667Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0470014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0470229Z getattr(self, test_name)() 2022-11-23T03:12:19.0470570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0470659Z fn() 2022-11-23T03:12:19.0471014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0471127Z test(self, **param_kwargs) 2022-11-23T03:12:19.0471479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0471597Z return func(*args, **kwargs) 2022-11-23T03:12:19.0471886Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0471991Z self.run_subtests( 2022-11-23T03:12:19.0472321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0472480Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0472837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0472984Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0473355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T03:12:19.0473471Z output = model(*input) 2022-11-23T03:12:19.0473789Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0473925Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0474280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0474447Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0474809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0474925Z _lazy_init(state, module) 2022-11-23T03:12:19.0475268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0475404Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0475729Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0475842Z return func(*args, **kwargs) 2022-11-23T03:12:19.0476198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0476289Z p_assert( 2022-11-23T03:12:19.0476615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0476730Z traceback.print_stack() 2022-11-23T03:12:19.0476856Z File "", line 1, in 2022-11-23T03:12:19.0477099Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0477237Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0477429Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0477562Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0477764Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, 
in _bootstrap 2022-11-23T03:12:19.0477859Z self.run() 2022-11-23T03:12:19.0478058Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0478192Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0478521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0478643Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0478985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0479148Z getattr(self, test_name)() 2022-11-23T03:12:19.0479499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0479587Z fn() 2022-11-23T03:12:19.0479941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0480054Z test(self, **param_kwargs) 2022-11-23T03:12:19.0480399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0480516Z return func(*args, **kwargs) 2022-11-23T03:12:19.0480792Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0480896Z self.run_subtests( 2022-11-23T03:12:19.0481019Z File "", line 1, in 2022-11-23T03:12:19.0481365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0481520Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0481722Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0481858Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0482212Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0482346Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0482539Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0482679Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0483046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0483162Z output = model(*input) 2022-11-23T03:12:19.0483369Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0483463Z self.run() 2022-11-23T03:12:19.0483778Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0483901Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0484100Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0484238Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0484603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0484772Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0485097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0485222Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0485625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0485734Z _lazy_init(state, module) 2022-11-23T03:12:19.0486091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0486207Z getattr(self, test_name)() 2022-11-23T03:12:19.0486549Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0486682Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0487029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0487120Z fn() 2022-11-23T03:12:19.0487446Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0487554Z return func(*args, **kwargs) 2022-11-23T03:12:19.0487960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0488075Z test(self, **param_kwargs) 2022-11-23T03:12:19.0488490Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0488584Z p_assert( 2022-11-23T03:12:19.0488931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0489046Z return func(*args, **kwargs) 2022-11-23T03:12:19.0489370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0489477Z traceback.print_stack() 2022-11-23T03:12:19.0489763Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0489873Z self.run_subtests( 2022-11-23T03:12:19.0490215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0490368Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0490724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0490868Z fsdp_loss = self._train_for_several_steps( 
2022-11-23T03:12:19.0491233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0491335Z output = model(*input) 2022-11-23T03:12:19.0491651Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0491786Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0492152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0492329Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0492684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0492797Z _lazy_init(state, module) 2022-11-23T03:12:19.0493142Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0493267Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0493592Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0493710Z return func(*args, **kwargs) 2022-11-23T03:12:19.0494079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0494169Z p_assert( 2022-11-23T03:12:19.0494492Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0494615Z traceback.print_stack() 2022-11-23T03:12:19.0494783Z File "", line 1, in 2022-11-23T03:12:19.0494982Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0495119Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0495309Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0495453Z return 
self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0495657Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0495753Z self.run() 2022-11-23T03:12:19.0495951Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0496079Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0496408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0496582Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0496939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0497053Z getattr(self, test_name)() 2022-11-23T03:12:19.0497400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0497489Z fn() 2022-11-23T03:12:19.0497838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0497942Z test(self, **param_kwargs) 2022-11-23T03:12:19.0498286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0498405Z return func(*args, **kwargs) 2022-11-23T03:12:19.0498690Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0498801Z self.run_subtests( 2022-11-23T03:12:19.0499146Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0499302Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0499656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0499790Z fsdp_loss = 
self._train_for_several_steps( 2022-11-23T03:12:19.0500159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0500275Z output = model(*input) 2022-11-23T03:12:19.0500595Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0500729Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0501095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0501270Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0501629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0501732Z _lazy_init(state, module) 2022-11-23T03:12:19.0502071Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0502209Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0502536Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0502656Z return func(*args, **kwargs) 2022-11-23T03:12:19.0503023Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0503121Z p_assert( 2022-11-23T03:12:19.0503453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0503607Z traceback.print_stack() 2022-11-23T03:12:19.0503736Z File "", line 1, in 2022-11-23T03:12:19.0504190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0504337Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0504532Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:12:19.0504677Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0504883Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0504983Z self.run() 2022-11-23T03:12:19.0505166Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0505305Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0505644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0505850Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0506206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0506327Z getattr(self, test_name)() 2022-11-23T03:12:19.0506674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0506754Z fn() 2022-11-23T03:12:19.0507110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0507223Z test(self, **param_kwargs) 2022-11-23T03:12:19.0507568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0507687Z return func(*args, **kwargs) 2022-11-23T03:12:19.0507980Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0508096Z self.run_subtests( 2022-11-23T03:12:19.0508440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0508583Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0508939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T03:12:19.0509085Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0509455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0509564Z output = model(*input) 2022-11-23T03:12:19.0509880Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0510014Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0510387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0510553Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0510900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0511014Z _lazy_init(state, module) 2022-11-23T03:12:19.0511355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0511490Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0511819Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0511939Z return func(*args, **kwargs) 2022-11-23T03:12:19.0512310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0512408Z p_assert( 2022-11-23T03:12:19.0512784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0512914Z traceback.print_stack() 2022-11-23T03:12:19.0513035Z File "", line 1, in 2022-11-23T03:12:19.0513237Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0513372Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0513566Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0513711Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0513904Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0514000Z self.run() 2022-11-23T03:12:19.0514199Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0514339Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0514674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0514847Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0515201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0515317Z getattr(self, test_name)() 2022-11-23T03:12:19.0515655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0515746Z fn() 2022-11-23T03:12:19.0516101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0516220Z test(self, **param_kwargs) 2022-11-23T03:12:19.0516565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0516680Z return func(*args, **kwargs) 2022-11-23T03:12:19.0516974Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0517088Z self.run_subtests( 2022-11-23T03:12:19.0517420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0517578Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0517934Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0518082Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0518447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0518559Z output = model(*input) 2022-11-23T03:12:19.0519032Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0519161Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0519514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0519678Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0520024Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0520135Z _lazy_init(state, module) 2022-11-23T03:12:19.0520466Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0520596Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0520916Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0521033Z return func(*args, **kwargs) 2022-11-23T03:12:19.0521381Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0521476Z p_assert( 2022-11-23T03:12:19.0521841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0521963Z traceback.print_stack() 2022-11-23T03:12:19.0522083Z File "", line 1, in 2022-11-23T03:12:19.0522281Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 
2022-11-23T03:12:19.0522590Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0522776Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0522921Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0523126Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0523223Z self.run() 2022-11-23T03:12:19.0523415Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0523551Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0523959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0524086Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0524428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0524550Z getattr(self, test_name)() 2022-11-23T03:12:19.0524897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0524990Z fn() 2022-11-23T03:12:19.0525343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0525458Z test(self, **param_kwargs) 2022-11-23T03:12:19.0525809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0525928Z return func(*args, **kwargs) 2022-11-23T03:12:19.0526213Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0526319Z self.run_subtests( 2022-11-23T03:12:19.0526996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0527152Z test_fn(*test_args, **test_kwargs, 
**subtest_kwargs) 2022-11-23T03:12:19.0527506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0527650Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0528017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0528128Z output = model(*input) 2022-11-23T03:12:19.0528434Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0528575Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0528946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0529115Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0529476Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0529589Z _lazy_init(state, module) 2022-11-23T03:12:19.0529929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0530066Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0530541Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0530829Z return func(*args, **kwargs) 2022-11-23T03:12:19.0531250Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0531399Z p_assert( 2022-11-23T03:12:19.0531743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0531863Z traceback.print_stack() 2022-11-23T03:12:19.0531988Z File "", line 1, in 2022-11-23T03:12:19.0532186Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", 
line 116, in spawn_main 2022-11-23T03:12:19.0532309Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0532506Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0532650Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0532854Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0532955Z self.run() 2022-11-23T03:12:19.0533151Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0533338Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0533819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0533942Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0534287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0534401Z getattr(self, test_name)() 2022-11-23T03:12:19.0534919Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0535012Z fn() 2022-11-23T03:12:19.0535369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0535489Z test(self, **param_kwargs) 2022-11-23T03:12:19.0535823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0535948Z return func(*args, **kwargs) 2022-11-23T03:12:19.0536239Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0536347Z self.run_subtests( 2022-11-23T03:12:19.0536692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0536846Z test_fn(*test_args, 
**test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0537203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0537347Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0537881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0537983Z output = model(*input) 2022-11-23T03:12:19.0538289Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0538423Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0538782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0538947Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0539295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0539410Z _lazy_init(state, module) 2022-11-23T03:12:19.0539917Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0540044Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0540380Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0540501Z return func(*args, **kwargs) 2022-11-23T03:12:19.0540922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0541026Z p_assert( 2022-11-23T03:12:19.0541356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0541476Z traceback.print_stack() 2022-11-23T03:12:19.0541586Z File "", line 1, in 2022-11-23T03:12:19.0541789Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0541924Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0542118Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0542265Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0542471Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0542572Z self.run() 2022-11-23T03:12:19.0542769Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0542948Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0543351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0543479Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0543830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0544148Z getattr(self, test_name)() 2022-11-23T03:12:19.0544511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0544603Z fn() 2022-11-23T03:12:19.0544959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0545064Z test(self, **param_kwargs) 2022-11-23T03:12:19.0545410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0545537Z return func(*args, **kwargs) 2022-11-23T03:12:19.0545828Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0545936Z self.run_subtests( 2022-11-23T03:12:19.0546277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 
2022-11-23T03:12:19.0546433Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0546786Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0546919Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0547281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0547394Z output = model(*input) 2022-11-23T03:12:19.0547715Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0547855Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0548222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0548390Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0548750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0548853Z _lazy_init(state, module) 2022-11-23T03:12:19.0549195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0549494Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0549814Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0549926Z return func(*args, **kwargs) 2022-11-23T03:12:19.0550361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0550461Z p_assert( 2022-11-23T03:12:19.0550958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0551066Z traceback.print_stack() 2022-11-23T03:12:19.0551190Z File "", line 1, in 
2022-11-23T03:12:19.0551310Z File "", line 1, in 2022-11-23T03:12:19.0551514Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0551650Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0551853Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0551984Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0552168Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0552379Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0552576Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0552723Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0552929Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0553031Z self.run() 2022-11-23T03:12:19.0553231Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0553328Z self.run() 2022-11-23T03:12:19.0553511Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0553651Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0554002Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0554138Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0554642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0554780Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0555115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0555229Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0555354Z File "", line 1, in 2022-11-23T03:12:19.0555704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in 
run_test 2022-11-23T03:12:19.0555822Z getattr(self, test_name)() 2022-11-23T03:12:19.0556175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0556292Z getattr(self, test_name)() 2022-11-23T03:12:19.0556494Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0556629Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0556967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0557066Z fn() 2022-11-23T03:12:19.0557420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0557507Z fn() 2022-11-23T03:12:19.0557700Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0557846Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0558204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0558319Z test(self, **param_kwargs) 2022-11-23T03:12:19.0558659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0558778Z test(self, **param_kwargs) 2022-11-23T03:12:19.0558983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0559085Z self.run() 2022-11-23T03:12:19.0559634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0559757Z return func(*args, **kwargs) 2022-11-23T03:12:19.0560095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0560210Z return func(*args, **kwargs) 2022-11-23T03:12:19.0560389Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0560520Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0560794Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0560902Z self.run_subtests( 2022-11-23T03:12:19.0561181Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0561331Z self.run_subtests( 2022-11-23T03:12:19.0561656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0561780Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0562099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0562249Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0562582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0562733Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0563071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0563183Z getattr(self, test_name)() 2022-11-23T03:12:19.0563523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0563672Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0563999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0564139Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0564474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:19.0564561Z fn() 2022-11-23T03:12:19.0564912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0565022Z output = model(*input) 2022-11-23T03:12:19.0565375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0565484Z output = model(*input) 2022-11-23T03:12:19.0565995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0566116Z test(self, **param_kwargs) 2022-11-23T03:12:19.0566433Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0566569Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0566890Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0567027Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0567382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0567501Z return func(*args, **kwargs) 2022-11-23T03:12:19.0567857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0568028Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0568362Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0568475Z self.run_subtests( 2022-11-23T03:12:19.0568846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0569015Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.0569370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0569486Z _lazy_init(state, module) 2022-11-23T03:12:19.0569971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0570299Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0570651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0570813Z _lazy_init(state, module) 2022-11-23T03:12:19.0571160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0571298Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0571655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0571803Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0572130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0572266Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0572594Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0572713Z return func(*args, **kwargs) 2022-11-23T03:12:19.0573086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0573203Z output = model(*input) 2022-11-23T03:12:19.0573535Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0573656Z return func(*args, **kwargs) 2022-11-23T03:12:19.0574014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T03:12:19.0574109Z p_assert( 2022-11-23T03:12:19.0574425Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0574562Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0574935Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0575028Z p_assert( 2022-11-23T03:12:19.0575353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0575481Z traceback.print_stack() 2022-11-23T03:12:19.0575836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0576164Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0576476Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0576594Z traceback.print_stack() 2022-11-23T03:12:19.0576939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0577052Z _lazy_init(state, module) 2022-11-23T03:12:19.0577384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0577516Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0578057Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0578179Z return func(*args, **kwargs) 2022-11-23T03:12:19.0578549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0578643Z p_assert( 2022-11-23T03:12:19.0578963Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.0579082Z traceback.print_stack() 2022-11-23T03:12:19.0579204Z File "", line 1, in 2022-11-23T03:12:19.0579404Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0579531Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0579731Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0579875Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0580141Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0580243Z self.run() 2022-11-23T03:12:19.0580437Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0580577Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0580899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0581027Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0581383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0581498Z getattr(self, test_name)() 2022-11-23T03:12:19.0582001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0582091Z fn() 2022-11-23T03:12:19.0582616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0582740Z test(self, **param_kwargs) 2022-11-23T03:12:19.0583081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0583201Z return func(*args, **kwargs) 2022-11-23T03:12:19.0583488Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0583597Z self.run_subtests( 
2022-11-23T03:12:19.0584144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0584313Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0584675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0584822Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0585182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0585294Z output = model(*input) 2022-11-23T03:12:19.0585615Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0585750Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0586116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0586285Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0586643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0586757Z _lazy_init(state, module) 2022-11-23T03:12:19.0587087Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0587224Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0587633Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0587765Z return func(*args, **kwargs) 2022-11-23T03:12:19.0588138Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0588242Z p_assert( 2022-11-23T03:12:19.0588617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:19.0588739Z traceback.print_stack() 2022-11-23T03:12:19.0588851Z File "", line 1, in 2022-11-23T03:12:19.0589052Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0589187Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0589382Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0589527Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0589812Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0589909Z self.run() 2022-11-23T03:12:19.0590100Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0590229Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0590564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0590692Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0591048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0591166Z getattr(self, test_name)() 2022-11-23T03:12:19.0591517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0591608Z fn() 2022-11-23T03:12:19.0591953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0592079Z test(self, **param_kwargs) 2022-11-23T03:12:19.0592425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0592542Z return func(*args, **kwargs) 2022-11-23T03:12:19.0592830Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0592935Z 
self.run_subtests( 2022-11-23T03:12:19.0593279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0593439Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0593795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0593930Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0594455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0594565Z output = model(*input) 2022-11-23T03:12:19.0594871Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0595002Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0595356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0595519Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0596045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0596148Z _lazy_init(state, module) 2022-11-23T03:12:19.0596495Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0596636Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0597010Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0597134Z return func(*args, **kwargs) 2022-11-23T03:12:19.0597503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0597600Z p_assert( 2022-11-23T03:12:19.0597929Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0598038Z traceback.print_stack() 2022-11-23T03:12:19.0598160Z File "", line 1, in 2022-11-23T03:12:19.0598359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0598493Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0598690Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0598887Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0599015Z File "", line 1, in 2022-11-23T03:12:19.0599209Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0599308Z self.run() 2022-11-23T03:12:19.0599500Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0599638Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0599997Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0600127Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0600451Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0600573Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0600747Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0600887Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0601238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0601351Z getattr(self, test_name)() 2022-11-23T03:12:19.0601550Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0601644Z self.run() 2022-11-23T03:12:19.0601987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:19.0602064Z fn() 2022-11-23T03:12:19.0602255Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0602389Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0602739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0602855Z test(self, **param_kwargs) 2022-11-23T03:12:19.0603168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0603477Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0603829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0603935Z return func(*args, **kwargs) 2022-11-23T03:12:19.0604289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0604406Z getattr(self, test_name)() 2022-11-23T03:12:19.0604697Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0604806Z self.run_subtests( 2022-11-23T03:12:19.0605159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0605252Z fn() 2022-11-23T03:12:19.0605597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0605789Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0606161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0606275Z test(self, **param_kwargs) 2022-11-23T03:12:19.0606621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T03:12:19.0606767Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0607115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0607236Z return func(*args, **kwargs) 2022-11-23T03:12:19.0607759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0607857Z output = model(*input) 2022-11-23T03:12:19.0608182Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0608287Z self.run_subtests( 2022-11-23T03:12:19.0608592Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0608898Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0609242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0609395Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0609763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0609920Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0610270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0610423Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0610783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0610901Z _lazy_init(state, module) 2022-11-23T03:12:19.0611267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0611378Z output = model(*input) 
2022-11-23T03:12:19.0611881Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0612001Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0612305Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0612434Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0612748Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0612869Z return func(*args, **kwargs) 2022-11-23T03:12:19.0613230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0613393Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0613750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0613842Z p_assert( 2022-11-23T03:12:19.0614173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0614285Z _lazy_init(state, module) 2022-11-23T03:12:19.0614598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0614713Z traceback.print_stack() 2022-11-23T03:12:19.0615045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0615256Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0615585Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0615687Z return func(*args, **kwargs) 2022-11-23T03:12:19.0616042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0616136Z 
p_assert( 2022-11-23T03:12:19.0616454Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0616569Z traceback.print_stack() 2022-11-23T03:12:19.0616688Z File "", line 1, in 2022-11-23T03:12:19.0616881Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0617016Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0617241Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0617380Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0617580Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0617674Z self.run() 2022-11-23T03:12:19.0617866Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0617998Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0618329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0618451Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0618779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0618891Z getattr(self, test_name)() 2022-11-23T03:12:19.0619230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0619322Z fn() 2022-11-23T03:12:19.0619670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0619784Z test(self, **param_kwargs) 2022-11-23T03:12:19.0620116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0620219Z return func(*args, **kwargs) 2022-11-23T03:12:19.0620495Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0620599Z self.run_subtests( 2022-11-23T03:12:19.0620931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0621082Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0621426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0621574Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0621930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0622038Z output = model(*input) 2022-11-23T03:12:19.0622333Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0622464Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0623004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0623173Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0623532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0623646Z _lazy_init(state, module) 2022-11-23T03:12:19.0624260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0624414Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0624742Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0624859Z return func(*args, **kwargs) 2022-11-23T03:12:19.0625225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T03:12:19.0625322Z p_assert( 2022-11-23T03:12:19.0625648Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0625766Z traceback.print_stack() 2022-11-23T03:12:19.0625888Z File "", line 1, in 2022-11-23T03:12:19.0626080Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0626283Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0626484Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0626632Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0626838Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0626936Z self.run() 2022-11-23T03:12:19.0627462Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0627601Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0627925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0628052Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0628403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0628516Z getattr(self, test_name)() 2022-11-23T03:12:19.0628866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0628966Z fn() 2022-11-23T03:12:19.0629323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0629439Z test(self, **param_kwargs) 2022-11-23T03:12:19.0629775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0629893Z return func(*args, **kwargs) 
2022-11-23T03:12:19.0630178Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0630286Z self.run_subtests( 2022-11-23T03:12:19.0630795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0631117Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0631285Z File "", line 1, in 2022-11-23T03:12:19.0631651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0631789Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0631991Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0632127Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0632491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0632605Z output = model(*input) 2022-11-23T03:12:19.0632804Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0632951Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0633266Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0633393Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0633644Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0633748Z self.run() 2022-11-23T03:12:19.0634266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0634428Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0634617Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T03:12:19.0634751Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0635267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0635386Z _lazy_init(state, module) 2022-11-23T03:12:19.0635713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0635839Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0636234Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0636371Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0636720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0636836Z getattr(self, test_name)() 2022-11-23T03:12:19.0637152Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0637269Z return func(*args, **kwargs) 2022-11-23T03:12:19.0637621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0637713Z fn() 2022-11-23T03:12:19.0638246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0638337Z p_assert( 2022-11-23T03:12:19.0638684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0638797Z test(self, **param_kwargs) 2022-11-23T03:12:19.0639100Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0639219Z traceback.print_stack() 2022-11-23T03:12:19.0639557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 
2022-11-23T03:12:19.0639840Z return func(*args, **kwargs) 2022-11-23T03:12:19.0640130Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0640240Z self.run_subtests( 2022-11-23T03:12:19.0640577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0640733Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0641084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0641231Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0641596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0641708Z output = model(*input) 2022-11-23T03:12:19.0642026Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0642162Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0642532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0642698Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0643047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0643217Z _lazy_init(state, module) 2022-11-23T03:12:19.0643574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0643715Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0644042Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0644162Z return func(*args, **kwargs) 
2022-11-23T03:12:19.0644533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0644634Z p_assert( 2022-11-23T03:12:19.0644949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0645067Z traceback.print_stack() 2022-11-23T03:12:19.0645190Z File "", line 1, in 2022-11-23T03:12:19.0645440Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0645577Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0645770Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0645911Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0646114Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0646199Z self.run() 2022-11-23T03:12:19.0646399Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0646540Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0646876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0647005Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0647357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0647475Z getattr(self, test_name)() 2022-11-23T03:12:19.0647829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0647910Z fn() 2022-11-23T03:12:19.0648270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0648389Z test(self, **param_kwargs) 2022-11-23T03:12:19.0648736Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0648857Z return func(*args, **kwargs) 2022-11-23T03:12:19.0649144Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0649252Z self.run_subtests( 2022-11-23T03:12:19.0649760Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0649904Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0650253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0650399Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0650755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0651031Z output = model(*input) 2022-11-23T03:12:19.0651351Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0651484Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0651853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0652010Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0652415Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0652538Z _lazy_init(state, module) 2022-11-23T03:12:19.0652885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0653019Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0653343Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T03:12:19.0653461Z return func(*args, **kwargs) 2022-11-23T03:12:19.0653836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0653921Z p_assert( 2022-11-23T03:12:19.0654249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0654368Z traceback.print_stack() 2022-11-23T03:12:19.0654492Z File "", line 1, in 2022-11-23T03:12:19.0654760Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0654895Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0655092Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0655226Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0655430Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0655527Z self.run() 2022-11-23T03:12:19.0655727Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0655864Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0656196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0656322Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0656679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0656788Z getattr(self, test_name)() 2022-11-23T03:12:19.0657144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0657236Z fn() 2022-11-23T03:12:19.0657593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0657710Z test(self, **param_kwargs) 
2022-11-23T03:12:19.0658059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0658178Z return func(*args, **kwargs) 2022-11-23T03:12:19.0658465Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0658562Z self.run_subtests( 2022-11-23T03:12:19.0658908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0659073Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0659586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0659725Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0660081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0660193Z output = model(*input) 2022-11-23T03:12:19.0660497Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0660615Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0660973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0661133Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0661528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0661643Z _lazy_init(state, module) 2022-11-23T03:12:19.0661973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0662103Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0662426Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0662529Z return func(*args, **kwargs) 2022-11-23T03:12:19.0662890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0662984Z p_assert( 2022-11-23T03:12:19.0663299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0663415Z traceback.print_stack() 2022-11-23T03:12:19.0663580Z File "", line 1, in 2022-11-23T03:12:19.0663778Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0664290Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0664486Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0664632Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0664838Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0664939Z self.run() 2022-11-23T03:12:19.0665135Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0665271Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0665612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0665727Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0666079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0666204Z getattr(self, test_name)() 2022-11-23T03:12:19.0666554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0666650Z fn() 2022-11-23T03:12:19.0667005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.0667122Z test(self, **param_kwargs) 2022-11-23T03:12:19.0667465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0667571Z return func(*args, **kwargs) 2022-11-23T03:12:19.0667861Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0667969Z self.run_subtests( 2022-11-23T03:12:19.0668324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0668477Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0668830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0668973Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0669341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0669441Z output = model(*input) 2022-11-23T03:12:19.0669761Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0669892Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0670261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0670768Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0671192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0671317Z _lazy_init(state, module) 2022-11-23T03:12:19.0671668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0671808Z handle.init_flat_param_attributes() 
2022-11-23T03:12:19.0672127Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0672246Z return func(*args, **kwargs) 2022-11-23T03:12:19.0672615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0672714Z p_assert( 2022-11-23T03:12:19.0673038Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0673229Z traceback.print_stack() 2022-11-23T03:12:19.0673359Z File "", line 1, in 2022-11-23T03:12:19.0673551Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0673690Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0673884Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0674029Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0674235Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0674334Z self.run() 2022-11-23T03:12:19.0674529Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0674665Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0674987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0675110Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0675475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0675590Z getattr(self, test_name)() 2022-11-23T03:12:19.0675939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0676031Z fn() 2022-11-23T03:12:19.0676548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T03:12:19.0676660Z test(self, **param_kwargs) 2022-11-23T03:12:19.0676983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0677270Z return func(*args, **kwargs) 2022-11-23T03:12:19.0677563Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0677674Z self.run_subtests( 2022-11-23T03:12:19.0678023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0678178Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0678539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0678683Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0679038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0679153Z output = model(*input) 2022-11-23T03:12:19.0679467Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0679600Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0679967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0680348Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0680711Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0680822Z _lazy_init(state, module) 2022-11-23T03:12:19.0681141Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0681274Z 
handle.init_flat_param_attributes() 2022-11-23T03:12:19.0681592Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0681706Z return func(*args, **kwargs) 2022-11-23T03:12:19.0682060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0682155Z p_assert( 2022-11-23T03:12:19.0682475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0682637Z traceback.print_stack() 2022-11-23T03:12:19.0682747Z File "", line 1, in 2022-11-23T03:12:19.0682943Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0683075Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0683265Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0683403Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0683603Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0683697Z self.run() 2022-11-23T03:12:19.0683875Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0684189Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0684521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0684651Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0685006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0685123Z getattr(self, test_name)() 2022-11-23T03:12:19.0685468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0685557Z fn() 2022-11-23T03:12:19.0685900Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0686017Z test(self, **param_kwargs) 2022-11-23T03:12:19.0686365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0686480Z return func(*args, **kwargs) 2022-11-23T03:12:19.0686773Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0686887Z self.run_subtests( 2022-11-23T03:12:19.0687232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0687388Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0687734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0687879Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0688246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0688358Z output = model(*input) 2022-11-23T03:12:19.0688734Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0688869Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0689238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0689460Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0689817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0689931Z _lazy_init(state, module) 2022-11-23T03:12:19.0690275Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0690412Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0690891Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0691001Z return func(*args, **kwargs) 2022-11-23T03:12:19.0691538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0691636Z p_assert( 2022-11-23T03:12:19.0691954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0692125Z traceback.print_stack() 2022-11-23T03:12:19.0692251Z File "", line 1, in 2022-11-23T03:12:19.0692453Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0692589Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0692783Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0692926Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0693133Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0693219Z self.run() 2022-11-23T03:12:19.0693415Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0693550Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0693879Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0694011Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0694526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0694641Z getattr(self, test_name)() 2022-11-23T03:12:19.0694967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:19.0695055Z fn() 2022-11-23T03:12:19.0695398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0695509Z test(self, **param_kwargs) 2022-11-23T03:12:19.0695842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0695955Z return func(*args, **kwargs) 2022-11-23T03:12:19.0696414Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0696531Z self.run_subtests( 2022-11-23T03:12:19.0696863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0697019Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0697373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0697519Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0697888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0698004Z output = model(*input) 2022-11-23T03:12:19.0698317Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0698454Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0698862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0699038Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0699398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0699511Z _lazy_init(state, module) 
2022-11-23T03:12:19.0699855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0699991Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0700468Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0700581Z return func(*args, **kwargs) 2022-11-23T03:12:19.0700936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0701064Z p_assert( 2022-11-23T03:12:19.0701388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0701507Z traceback.print_stack() 2022-11-23T03:12:19.0701626Z File "", line 1, in 2022-11-23T03:12:19.0701744Z File "", line 1, in 2022-11-23T03:12:19.0701938Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0702065Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0702243Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0702379Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0702573Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0702700Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0702899Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0702996Z self.run() 2022-11-23T03:12:19.0703188Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0703316Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0703510Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0703604Z self.run() 2022-11-23T03:12:19.0703794Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T03:12:19.0704306Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0704657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0704788Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0705141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0705245Z getattr(self, test_name)() 2022-11-23T03:12:19.0705439Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0705587Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0705935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0706024Z fn() 2022-11-23T03:12:19.0706354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0706480Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0706834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0706937Z test(self, **param_kwargs) 2022-11-23T03:12:19.0707283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0707395Z getattr(self, test_name)() 2022-11-23T03:12:19.0707742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0708027Z return func(*args, **kwargs) 2022-11-23T03:12:19.0708437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0708539Z fn() 2022-11-23T03:12:19.0708815Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 
2022-11-23T03:12:19.0708908Z self.run_subtests( 2022-11-23T03:12:19.0709436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0709556Z test(self, **param_kwargs) 2022-11-23T03:12:19.0709896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0710050Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0710397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0710577Z return func(*args, **kwargs) 2022-11-23T03:12:19.0710931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0711063Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0711356Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0711467Z self.run_subtests( 2022-11-23T03:12:19.0711837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0711947Z output = model(*input) 2022-11-23T03:12:19.0712446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0712596Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0712910Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0713029Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0713144Z File "", line 1, in 2022-11-23T03:12:19.0713481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0713623Z fsdp_loss = 
self._train_for_several_steps( 2022-11-23T03:12:19.0713977Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0714142Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0714498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0714607Z output = model(*input) 2022-11-23T03:12:19.0714791Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0714931Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0715282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0715392Z _lazy_init(state, module) 2022-11-23T03:12:19.0715697Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0715821Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0716008Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0716148Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0716467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0716603Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0716952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0717160Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0717367Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0717461Z self.run() 2022-11-23T03:12:19.0717780Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T03:12:19.0717893Z return func(*args, **kwargs) 2022-11-23T03:12:19.0718226Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0718338Z _lazy_init(state, module) 2022-11-23T03:12:19.0718527Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0718660Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0719016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0719167Z p_assert( 2022-11-23T03:12:19.0719501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0719631Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0719938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0720059Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0720377Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0720495Z traceback.print_stack() 2022-11-23T03:12:19.0720815Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0720929Z return func(*args, **kwargs) 2022-11-23T03:12:19.0721270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0721567Z getattr(self, test_name)() 2022-11-23T03:12:19.0721932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0722027Z p_assert( 2022-11-23T03:12:19.0722373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0722465Z fn() 
2022-11-23T03:12:19.0722792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0722910Z traceback.print_stack() 2022-11-23T03:12:19.0723260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0723365Z test(self, **param_kwargs) 2022-11-23T03:12:19.0723711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0723831Z return func(*args, **kwargs) 2022-11-23T03:12:19.0724122Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0724227Z self.run_subtests( 2022-11-23T03:12:19.0724573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0724728Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0725079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0725215Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0725579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0725691Z output = model(*input) 2022-11-23T03:12:19.0726169Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0726350Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0726717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0726879Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0727230Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0727343Z _lazy_init(state, module) 2022-11-23T03:12:19.0727838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0727974Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0728304Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0728418Z return func(*args, **kwargs) 2022-11-23T03:12:19.0728843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0728942Z p_assert( 2022-11-23T03:12:19.0729265Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0729373Z traceback.print_stack() 2022-11-23T03:12:19.0729495Z File "", line 1, in 2022-11-23T03:12:19.0729696Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0729834Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0730028Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0730171Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0730375Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0730472Z self.run() 2022-11-23T03:12:19.0730657Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0730802Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0731507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0731638Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0731994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T03:12:19.0732110Z getattr(self, test_name)() 2022-11-23T03:12:19.0732460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0732552Z fn() 2022-11-23T03:12:19.0732895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0733013Z test(self, **param_kwargs) 2022-11-23T03:12:19.0733361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0733483Z return func(*args, **kwargs) 2022-11-23T03:12:19.0733775Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0733883Z self.run_subtests( 2022-11-23T03:12:19.0734386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0734536Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0734868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0735008Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0735361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0735651Z output = model(*input) 2022-11-23T03:12:19.0735972Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0736153Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0736532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0736701Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.0737047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0737161Z _lazy_init(state, module) 2022-11-23T03:12:19.0737504Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0737637Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0737965Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0738077Z return func(*args, **kwargs) 2022-11-23T03:12:19.0738658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0738753Z p_assert( 2022-11-23T03:12:19.0739057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0739176Z traceback.print_stack() 2022-11-23T03:12:19.0739292Z File "", line 1, in 2022-11-23T03:12:19.0739489Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0739618Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0739974Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0740117Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0740311Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0740408Z self.run() 2022-11-23T03:12:19.0740606Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0740748Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0741079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0741203Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0741556Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0741673Z getattr(self, test_name)() 2022-11-23T03:12:19.0742010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0742104Z fn() 2022-11-23T03:12:19.0742460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0742573Z test(self, **param_kwargs) 2022-11-23T03:12:19.0743253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0743378Z return func(*args, **kwargs) 2022-11-23T03:12:19.0743666Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0743771Z self.run_subtests( 2022-11-23T03:12:19.0744310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0744468Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0744824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0744973Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0745339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0745454Z output = model(*input) 2022-11-23T03:12:19.0745998Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0746137Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0746671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T03:12:19.0746843Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0747198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0747315Z _lazy_init(state, module) 2022-11-23T03:12:19.0747657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0747792Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0748118Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0754995Z return func(*args, **kwargs) 2022-11-23T03:12:19.0755628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0755734Z p_assert( 2022-11-23T03:12:19.0756076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0756199Z traceback.print_stack() 2022-11-23T03:12:19.0756327Z File "", line 1, in 2022-11-23T03:12:19.0756529Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0756666Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0756866Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0756999Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0757202Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0757307Z self.run() 2022-11-23T03:12:19.0757509Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0757652Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0757993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0758116Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:19.0758460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0758735Z getattr(self, test_name)() 2022-11-23T03:12:19.0759260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0759351Z fn() 2022-11-23T03:12:19.0759712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0759830Z test(self, **param_kwargs) 2022-11-23T03:12:19.0760185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0760302Z return func(*args, **kwargs) 2022-11-23T03:12:19.0760579Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0760691Z self.run_subtests( 2022-11-23T03:12:19.0761029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0761182Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0761538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0761688Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0762207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0762319Z output = model(*input) 2022-11-23T03:12:19.0762702Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0762845Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0763203Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T03:12:19.0763367Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0763711Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0763819Z _lazy_init(state, module) 2022-11-23T03:12:19.0764146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0764278Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0764592Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0764787Z return func(*args, **kwargs) 2022-11-23T03:12:19.0765153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0765253Z p_assert( 2022-11-23T03:12:19.0765570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0765686Z traceback.print_stack() 2022-11-23T03:12:19.0765924Z File "", line 1, in 2022-11-23T03:12:19.0766125Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0766245Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0766442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0766753Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0766957Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0767063Z self.run() 2022-11-23T03:12:19.0767262Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0767404Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0767745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0767860Z 
self.run_test(test_name, pipe) 2022-11-23T03:12:19.0768213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0768330Z getattr(self, test_name)() 2022-11-23T03:12:19.0768682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0768774Z fn() 2022-11-23T03:12:19.0769128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0769250Z test(self, **param_kwargs) 2022-11-23T03:12:19.0769764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0769870Z return func(*args, **kwargs) 2022-11-23T03:12:19.0770151Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0770256Z self.run_subtests( 2022-11-23T03:12:19.0770587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0770738Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0771264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0771410Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0771779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0771952Z output = model(*input) 2022-11-23T03:12:19.0772287Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0772422Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0772792Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0772961Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0773317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0773431Z _lazy_init(state, module) 2022-11-23T03:12:19.0773774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0773896Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0774278Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0774401Z return func(*args, **kwargs) 2022-11-23T03:12:19.0774769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0774862Z p_assert( 2022-11-23T03:12:19.0775193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0775311Z traceback.print_stack() 2022-11-23T03:12:19.0775436Z File "", line 1, in 2022-11-23T03:12:19.0775628Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0775763Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0775960Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0776103Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0776316Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0776418Z self.run() 2022-11-23T03:12:19.0776616Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0776743Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0777238Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0777357Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0777695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0777808Z getattr(self, test_name)() 2022-11-23T03:12:19.0778143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0778230Z fn() 2022-11-23T03:12:19.0778575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0778682Z test(self, **param_kwargs) 2022-11-23T03:12:19.0779025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0779140Z return func(*args, **kwargs) 2022-11-23T03:12:19.0779418Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0779523Z self.run_subtests( 2022-11-23T03:12:19.0779856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0780005Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0780346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0780654Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0781068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0781192Z output = model(*input) 2022-11-23T03:12:19.0781509Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0781645Z return 
forward_call(*input, **kwargs) 2022-11-23T03:12:19.0782009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0782178Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0782530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0782633Z _lazy_init(state, module) 2022-11-23T03:12:19.0782971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0783108Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0783649Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0783763Z return func(*args, **kwargs) 2022-11-23T03:12:19.0784585Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0784686Z p_assert( 2022-11-23T03:12:19.0785020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0785127Z traceback.print_stack() 2022-11-23T03:12:19.0785872Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0786619Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.0787676Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0788406Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0789189Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0789921Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0790647Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0791597Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0792513Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0793231Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0793952Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0794751Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0795472Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0796193Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0796917Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0797638Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0798361Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0799080Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0799800Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.0800718Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.0800835Z dist init r=3, world=4 2022-11-23T03:12:19.0801148Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0801624Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0801927Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0802223Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0802592Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0802900Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0803189Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0803484Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:19.0803779Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0804079Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0804536Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0804999Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.0805101Z dist init r=0, world=4 2022-11-23T03:12:19.0805401Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0805701Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0805997Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0806296Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0806589Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0806872Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.0807165Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0807456Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0807795Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0808095Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0808385Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0808673Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.0808775Z dist init r=2, world=4 2022-11-23T03:12:19.0809091Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0809442Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0809739Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0810027Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:19.0810324Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0810619Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0810917Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0811210Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0811501Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0811796Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0812089Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0812386Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.0812494Z dist init r=1, world=4 2022-11-23T03:12:19.0812809Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0813107Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:19.0813409Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0813708Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0814048Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0814353Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0814642Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0814929Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0815219Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0815512Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0815854Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.0816145Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:19.0816241Z ok (6.122s) 2022-11-23T03:12:19.0816592Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27376 2022-11-23T03:12:19.0816806Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27377 2022-11-23T03:12:19.0817012Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27378 2022-11-23T03:12:19.0817224Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27379 2022-11-23T03:12:19.0817602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0817933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0818294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0818475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0818820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0818972Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0819326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0819500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0819856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0820026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0820381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0820556Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0820903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.0821053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.0821406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.0821583Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.0821810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.0822093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.0822327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.0822550Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.0822928Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0823305Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0824056Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.0824461Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.0824766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.0824989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.0825201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.0825416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.0826428Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0826538Z warnings.warn( 2022-11-23T03:12:19.0827695Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0827795Z warnings.warn( 2022-11-23T03:12:19.0828944Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0829051Z warnings.warn( 2022-11-23T03:12:19.0830053Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.0830154Z warnings.warn( 2022-11-23T03:12:19.0830265Z File "", line 1, in 2022-11-23T03:12:19.0830470Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0830606Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0830800Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0830943Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0831210Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0831513Z self.run() 2022-11-23T03:12:19.0831707Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0832019Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0832138Z File "", line 1, in 2022-11-23T03:12:19.0832478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0832606Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0832809Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0832944Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0833303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in 
run_test 2022-11-23T03:12:19.0833410Z getattr(self, test_name)() 2022-11-23T03:12:19.0833660Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0833805Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0834162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0834252Z fn() 2022-11-23T03:12:19.0834453Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0834546Z self.run() 2022-11-23T03:12:19.0834899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0835005Z test(self, **param_kwargs) 2022-11-23T03:12:19.0835201Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0835340Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0835693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0835816Z return func(*args, **kwargs) 2022-11-23T03:12:19.0836144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0836269Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0836547Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0836653Z self.run_subtests( 2022-11-23T03:12:19.0837004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0837119Z getattr(self, test_name)() 2022-11-23T03:12:19.0837459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0837613Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 
2022-11-23T03:12:19.0837958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0838053Z fn() 2022-11-23T03:12:19.0838399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0838545Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0839056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0839168Z test(self, **param_kwargs) 2022-11-23T03:12:19.0839524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0839631Z output = model(*input) 2022-11-23T03:12:19.0839964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0840250Z return func(*args, **kwargs) 2022-11-23T03:12:19.0840561Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0840740Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0841037Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0841145Z self.run_subtests( 2022-11-23T03:12:19.0841520Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0841688Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0842035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0842189Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0842544Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0842694Z _lazy_init(state, module) 2022-11-23T03:12:19.0843283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0843601Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0843949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0844087Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0844456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0844567Z output = model(*input) 2022-11-23T03:12:19.0844898Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0845005Z return func(*args, **kwargs) 2022-11-23T03:12:19.0845320Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0845460Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0845829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0845927Z p_assert( 2022-11-23T03:12:19.0846292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0846455Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0846778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0846886Z traceback.print_stack() 2022-11-23T03:12:19.0847246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0847361Z 
_lazy_init(state, module) 2022-11-23T03:12:19.0847703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0847845Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0848177Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0848293Z return func(*args, **kwargs) 2022-11-23T03:12:19.0848662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0848747Z p_assert( 2022-11-23T03:12:19.0849069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0849188Z traceback.print_stack() 2022-11-23T03:12:19.0849311Z File "", line 1, in 2022-11-23T03:12:19.0849517Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0849653Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0849847Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0850034Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0850247Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0850505Z self.run() 2022-11-23T03:12:19.0850697Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0850829Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0851323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0851450Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0851800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0851905Z getattr(self, test_name)() 2022-11-23T03:12:19.0852255Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0852408Z fn() 2022-11-23T03:12:19.0852771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0852887Z test(self, **param_kwargs) 2022-11-23T03:12:19.0853238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0853360Z return func(*args, **kwargs) 2022-11-23T03:12:19.0853647Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0853743Z self.run_subtests( 2022-11-23T03:12:19.0854087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0854242Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0854749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0854898Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0855257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0855371Z output = model(*input) 2022-11-23T03:12:19.0855861Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0855985Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0856355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0856524Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0856879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:19.0856991Z _lazy_init(state, module) 2022-11-23T03:12:19.0857331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0857472Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0857802Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0857907Z return func(*args, **kwargs) 2022-11-23T03:12:19.0858279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0858376Z p_assert( 2022-11-23T03:12:19.0858704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0858826Z traceback.print_stack() 2022-11-23T03:12:19.0859109Z File "", line 1, in 2022-11-23T03:12:19.0859300Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0859432Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0859615Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0859803Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0860007Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0860099Z self.run() 2022-11-23T03:12:19.0860285Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0860417Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0860737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0860848Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0861187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0861296Z getattr(self, test_name)() 
2022-11-23T03:12:19.0861634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0861766Z fn() 2022-11-23T03:12:19.0862115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0862226Z test(self, **param_kwargs) 2022-11-23T03:12:19.0862558Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0862661Z return func(*args, **kwargs) 2022-11-23T03:12:19.0862936Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0863039Z self.run_subtests( 2022-11-23T03:12:19.0863372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0863517Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0864246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0864416Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0864788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0864890Z output = model(*input) 2022-11-23T03:12:19.0865205Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0865336Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0865702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0865867Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0866218Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0866331Z _lazy_init(state, module) 2022-11-23T03:12:19.0866676Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0866800Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0867130Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0867249Z return func(*args, **kwargs) 2022-11-23T03:12:19.0867617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0867710Z p_assert( 2022-11-23T03:12:19.0868038Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0868155Z traceback.print_stack() 2022-11-23T03:12:19.0868276Z File "", line 1, in 2022-11-23T03:12:19.0868468Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0868602Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0868868Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0869019Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0869221Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0869316Z self.run() 2022-11-23T03:12:19.0869509Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0869645Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0870129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0870250Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0870592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T03:12:19.0870703Z getattr(self, test_name)() 2022-11-23T03:12:19.0871041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0871197Z fn() 2022-11-23T03:12:19.0871729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0871833Z test(self, **param_kwargs) 2022-11-23T03:12:19.0872182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0872295Z return func(*args, **kwargs) 2022-11-23T03:12:19.0872583Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0872689Z self.run_subtests( 2022-11-23T03:12:19.0873029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0873181Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0873541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0873688Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0874045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0874155Z output = model(*input) 2022-11-23T03:12:19.0874472Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0874607Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0874975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0875143Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.0875498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0875616Z _lazy_init(state, module) 2022-11-23T03:12:19.0875732Z File "", line 1, in 2022-11-23T03:12:19.0876079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0876218Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0876551Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0876672Z return func(*args, **kwargs) 2022-11-23T03:12:19.0876872Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0877165Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0877519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0877601Z p_assert( 2022-11-23T03:12:19.0877789Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0877927Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0878286Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0878585Z traceback.print_stack() 2022-11-23T03:12:19.0878790Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0878882Z self.run() 2022-11-23T03:12:19.0879066Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0879204Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0879530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0879652Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0880004Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0880118Z getattr(self, test_name)() 2022-11-23T03:12:19.0880518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0880615Z fn() 2022-11-23T03:12:19.0880957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0881075Z test(self, **param_kwargs) 2022-11-23T03:12:19.0881577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0881689Z return func(*args, **kwargs) 2022-11-23T03:12:19.0881969Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0882071Z self.run_subtests( 2022-11-23T03:12:19.0882402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0882729Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0883082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0883227Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0883592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0883702Z output = model(*input) 2022-11-23T03:12:19.0884021Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0884153Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0884519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T03:12:19.0884684Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0885030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0885151Z _lazy_init(state, module) 2022-11-23T03:12:19.0885646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0885774Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0886087Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0886199Z return func(*args, **kwargs) 2022-11-23T03:12:19.0886552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0886643Z p_assert( 2022-11-23T03:12:19.0886948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0887063Z traceback.print_stack() 2022-11-23T03:12:19.0887180Z File "", line 1, in 2022-11-23T03:12:19.0887373Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0887556Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0887747Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0887885Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0888265Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0888351Z self.run() 2022-11-23T03:12:19.0888597Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0888737Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0889073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0889196Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:19.0889547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0889715Z getattr(self, test_name)() 2022-11-23T03:12:19.0890055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0890145Z fn() 2022-11-23T03:12:19.0890502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0890617Z test(self, **param_kwargs) 2022-11-23T03:12:19.0890962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0891077Z return func(*args, **kwargs) 2022-11-23T03:12:19.0891361Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0891466Z self.run_subtests( 2022-11-23T03:12:19.0891951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0892281Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0892641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0892787Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0893148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0893261Z output = model(*input) 2022-11-23T03:12:19.0893576Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0893713Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0894068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T03:12:19.0894237Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0894595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0894712Z _lazy_init(state, module) 2022-11-23T03:12:19.0895210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0895348Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0895663Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0895778Z return func(*args, **kwargs) 2022-11-23T03:12:19.0896124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0896217Z p_assert( 2022-11-23T03:12:19.0896528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0896640Z traceback.print_stack() 2022-11-23T03:12:19.0896756Z File "", line 1, in 2022-11-23T03:12:19.0897184Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0897328Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0897521Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0897654Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0897861Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0897956Z self.run() 2022-11-23T03:12:19.0898152Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0898294Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0898636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0898759Z 
self.run_test(test_name, pipe) 2022-11-23T03:12:19.0899109Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0899262Z getattr(self, test_name)() 2022-11-23T03:12:19.0899616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0899705Z fn() 2022-11-23T03:12:19.0900062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0900176Z test(self, **param_kwargs) 2022-11-23T03:12:19.0900520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0900637Z return func(*args, **kwargs) 2022-11-23T03:12:19.0901075Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0901168Z self.run_subtests( 2022-11-23T03:12:19.0901499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0901655Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0901999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0902138Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0902489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0902594Z output = model(*input) 2022-11-23T03:12:19.0902897Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0903015Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0903368Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0903528Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0904294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0904412Z _lazy_init(state, module) 2022-11-23T03:12:19.0904759Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0904893Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0905218Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0905324Z return func(*args, **kwargs) 2022-11-23T03:12:19.0905690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0905787Z p_assert( 2022-11-23T03:12:19.0906117Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0906235Z traceback.print_stack() 2022-11-23T03:12:19.0906360Z File "", line 1, in 2022-11-23T03:12:19.0906671Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0906804Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0906997Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0907140Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0907344Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0907438Z self.run() 2022-11-23T03:12:19.0907630Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0907770Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0908104Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0908219Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0908728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0908900Z getattr(self, test_name)() 2022-11-23T03:12:19.0909235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0909318Z fn() 2022-11-23T03:12:19.0909658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0909767Z test(self, **param_kwargs) 2022-11-23T03:12:19.0910286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0910392Z return func(*args, **kwargs) 2022-11-23T03:12:19.0910678Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0910782Z self.run_subtests( 2022-11-23T03:12:19.0911122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0911282Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0911637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0911786Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0912150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0912251Z output = model(*input) 2022-11-23T03:12:19.0912570Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0912710Z return 
forward_call(*input, **kwargs) 2022-11-23T03:12:19.0913227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0913396Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0913747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0913859Z _lazy_init(state, module) 2022-11-23T03:12:19.0914190Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0914310Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0914630Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0914743Z return func(*args, **kwargs) 2022-11-23T03:12:19.0915101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0915190Z p_assert( 2022-11-23T03:12:19.0915508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0915625Z traceback.print_stack() 2022-11-23T03:12:19.0915745Z File "", line 1, in 2022-11-23T03:12:19.0915973Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0916117Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0916303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0916440Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0916636Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0916729Z self.run() 2022-11-23T03:12:19.0916917Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0917041Z self._target(*self._args, **self._kwargs) 
2022-11-23T03:12:19.0917362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0917481Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0917881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0917989Z getattr(self, test_name)() 2022-11-23T03:12:19.0918323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0918409Z fn() 2022-11-23T03:12:19.0918531Z File "", line 1, in 2022-11-23T03:12:19.0918864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0918972Z test(self, **param_kwargs) 2022-11-23T03:12:19.0919305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0919416Z return func(*args, **kwargs) 2022-11-23T03:12:19.0919606Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0919737Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0920016Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0920119Z self.run_subtests( 2022-11-23T03:12:19.0920295Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0920431Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0920764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0920910Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0921108Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0921200Z self.run() 
2022-11-23T03:12:19.0921316Z File "", line 1, in 2022-11-23T03:12:19.0921655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0921788Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0921978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0922112Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0922464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0922570Z output = model(*input) 2022-11-23T03:12:19.0922764Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0922894Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0923200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0923321Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0923623Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0923755Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0924171Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0924319Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0924669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0924787Z getattr(self, test_name)() 2022-11-23T03:12:19.0925143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0925309Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0925512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0925607Z self.run() 
2022-11-23T03:12:19.0925955Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0926091Z fn() 2022-11-23T03:12:19.0926452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0926570Z _lazy_init(state, module) 2022-11-23T03:12:19.0926914Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0927048Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0927388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0927498Z test(self, **param_kwargs) 2022-11-23T03:12:19.0927824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0927950Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0928263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0928382Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0928907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0929025Z return func(*args, **kwargs) 2022-11-23T03:12:19.0929354Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0929468Z return func(*args, **kwargs) 2022-11-23T03:12:19.0929813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0929926Z getattr(self, test_name)() 2022-11-23T03:12:19.0930209Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0930314Z self.run_subtests( 
2022-11-23T03:12:19.0930671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0930768Z p_assert( 2022-11-23T03:12:19.0931121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0931210Z fn() 2022-11-23T03:12:19.0931593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0931910Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0932405Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0932523Z traceback.print_stack() 2022-11-23T03:12:19.0932867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0932981Z test(self, **param_kwargs) 2022-11-23T03:12:19.0933324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0933472Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0933864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0933988Z return func(*args, **kwargs) 2022-11-23T03:12:19.0934355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0934466Z output = model(*input) 2022-11-23T03:12:19.0934743Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0934846Z self.run_subtests( 2022-11-23T03:12:19.0935313Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0935441Z 
return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0935774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0935969Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0936513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0936684Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0937025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0937171Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0937526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0937638Z _lazy_init(state, module) 2022-11-23T03:12:19.0938004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0938117Z output = model(*input) 2022-11-23T03:12:19.0938469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0938603Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0938910Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0939039Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0939523Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0939634Z return func(*args, **kwargs) 2022-11-23T03:12:19.0939987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0940146Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.0940679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0940777Z p_assert( 2022-11-23T03:12:19.0941126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0941238Z _lazy_init(state, module) 2022-11-23T03:12:19.0941567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0941683Z traceback.print_stack() 2022-11-23T03:12:19.0942024Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0942158Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0942482Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0942597Z return func(*args, **kwargs) 2022-11-23T03:12:19.0942954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0943047Z p_assert( 2022-11-23T03:12:19.0943580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0943704Z traceback.print_stack() 2022-11-23T03:12:19.0944212Z File "", line 1, in 2022-11-23T03:12:19.0944414Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0944547Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0944739Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0944872Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0945076Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0945171Z self.run() 2022-11-23T03:12:19.0945363Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0945499Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0945838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0946057Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0946405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0946521Z getattr(self, test_name)() 2022-11-23T03:12:19.0946869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0946957Z fn() 2022-11-23T03:12:19.0947311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0947424Z test(self, **param_kwargs) 2022-11-23T03:12:19.0947770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0947885Z return func(*args, **kwargs) 2022-11-23T03:12:19.0948166Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0948272Z self.run_subtests( 2022-11-23T03:12:19.0948614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0948764Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0949119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0949264Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0949631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0949737Z output = model(*input) 
2022-11-23T03:12:19.0950041Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0950176Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0950548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0950871Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0951213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0951322Z _lazy_init(state, module) 2022-11-23T03:12:19.0951648Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0951780Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0952093Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0952197Z return func(*args, **kwargs) 2022-11-23T03:12:19.0952550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0952644Z p_assert( 2022-11-23T03:12:19.0953020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0953142Z traceback.print_stack() 2022-11-23T03:12:19.0953256Z File "", line 1, in 2022-11-23T03:12:19.0953368Z File "", line 1, in 2022-11-23T03:12:19.0953552Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0953685Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0953877Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0954004Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0954190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:12:19.0954323Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0954511Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0954694Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0954883Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0954974Z self.run() 2022-11-23T03:12:19.0955163Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0955251Z self.run() 2022-11-23T03:12:19.0955438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0955573Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0955758Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0956046Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0956388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0956514Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0956840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0956969Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0957318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0957432Z getattr(self, test_name)() 2022-11-23T03:12:19.0957779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0957884Z getattr(self, test_name)() 2022-11-23T03:12:19.0958231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0958319Z fn() 2022-11-23T03:12:19.0958668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:19.0958755Z fn() 2022-11-23T03:12:19.0959106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0959229Z test(self, **param_kwargs) 2022-11-23T03:12:19.0959581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0959845Z test(self, **param_kwargs) 2022-11-23T03:12:19.0960178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0960290Z return func(*args, **kwargs) 2022-11-23T03:12:19.0960620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0960732Z return func(*args, **kwargs) 2022-11-23T03:12:19.0961007Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0961107Z self.run_subtests( 2022-11-23T03:12:19.0961429Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0961530Z self.run_subtests( 2022-11-23T03:12:19.0961866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0962015Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0962346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0962489Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0962828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0962967Z fsdp_loss = self._train_for_several_steps( 
2022-11-23T03:12:19.0963300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0963471Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0963826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0963934Z output = model(*input) 2022-11-23T03:12:19.0964284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0964391Z output = model(*input) 2022-11-23T03:12:19.0964696Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0964823Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0965129Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0965245Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0965597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0965767Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0966127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0966286Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0966627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0966735Z _lazy_init(state, module) 2022-11-23T03:12:19.0967079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0967349Z _lazy_init(state, module) 2022-11-23T03:12:19.0967692Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0967833Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0968176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0968308Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0968635Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0968750Z return func(*args, **kwargs) 2022-11-23T03:12:19.0969077Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0969183Z return func(*args, **kwargs) 2022-11-23T03:12:19.0969549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0969640Z p_assert( 2022-11-23T03:12:19.0970010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0970101Z p_assert( 2022-11-23T03:12:19.0970428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0970747Z traceback.print_stack() 2022-11-23T03:12:19.0971248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0971355Z traceback.print_stack() 2022-11-23T03:12:19.0971475Z File "", line 1, in 2022-11-23T03:12:19.0971676Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0971808Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0972000Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0972141Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0972347Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0972444Z self.run() 2022-11-23T03:12:19.0972629Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0972822Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0973155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0973280Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0973629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0973747Z getattr(self, test_name)() 2022-11-23T03:12:19.0974095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0974174Z fn() 2022-11-23T03:12:19.0974526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0974640Z test(self, **param_kwargs) 2022-11-23T03:12:19.0974984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0975102Z return func(*args, **kwargs) 2022-11-23T03:12:19.0975388Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0975498Z self.run_subtests( 2022-11-23T03:12:19.0975840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0975981Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0976333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0976476Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.0976839Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0976948Z output = model(*input) 2022-11-23T03:12:19.0977264Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0977735Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0978102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0978264Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0978608Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0978720Z _lazy_init(state, module) 2022-11-23T03:12:19.0979059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0979192Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0979514Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0979631Z return func(*args, **kwargs) 2022-11-23T03:12:19.0980048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0980148Z p_assert( 2022-11-23T03:12:19.0980467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0980586Z traceback.print_stack() 2022-11-23T03:12:19.0980864Z File "", line 1, in 2022-11-23T03:12:19.0981060Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0981189Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0981373Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0981509Z return self._bootstrap(parent_sentinel) 
2022-11-23T03:12:19.0981694Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0981785Z self.run() 2022-11-23T03:12:19.0982036Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0982170Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0982494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0982617Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0982960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0983071Z getattr(self, test_name)() 2022-11-23T03:12:19.0983398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0983482Z fn() 2022-11-23T03:12:19.0983822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0984128Z test(self, **param_kwargs) 2022-11-23T03:12:19.0984657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0984783Z return func(*args, **kwargs) 2022-11-23T03:12:19.0985071Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0985176Z self.run_subtests( 2022-11-23T03:12:19.0985508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0985660Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0986011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0986158Z fsdp_loss = self._train_for_several_steps( 
2022-11-23T03:12:19.0986523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0986633Z output = model(*input) 2022-11-23T03:12:19.0986959Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0987093Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0987451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0987617Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0988132Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0988242Z _lazy_init(state, module) 2022-11-23T03:12:19.0988808Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0988947Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0989272Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0989390Z return func(*args, **kwargs) 2022-11-23T03:12:19.0989817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0989922Z p_assert( 2022-11-23T03:12:19.0990247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0990363Z traceback.print_stack() 2022-11-23T03:12:19.0990485Z File "", line 1, in 2022-11-23T03:12:19.0990683Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.0990815Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.0990997Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.0991140Z return 
self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.0991340Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.0991651Z self.run() 2022-11-23T03:12:19.0991843Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.0991976Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.0992294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.0992591Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.0992934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.0993046Z getattr(self, test_name)() 2022-11-23T03:12:19.0993390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.0993479Z fn() 2022-11-23T03:12:19.0993829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.0993943Z test(self, **param_kwargs) 2022-11-23T03:12:19.0994292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.0994408Z return func(*args, **kwargs) 2022-11-23T03:12:19.0994685Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.0994791Z self.run_subtests( 2022-11-23T03:12:19.0995134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.0995288Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.0995791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.0995929Z fsdp_loss = 
self._train_for_several_steps( 2022-11-23T03:12:19.0996280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.0996389Z output = model(*input) 2022-11-23T03:12:19.0996689Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.0996819Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.0997349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.0997513Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.0997870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.0997982Z _lazy_init(state, module) 2022-11-23T03:12:19.0998324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.0998460Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.0998776Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.0998944Z return func(*args, **kwargs) 2022-11-23T03:12:19.0999324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.0999417Z p_assert( 2022-11-23T03:12:19.0999740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.0999854Z traceback.print_stack() 2022-11-23T03:12:19.0999974Z File "", line 1, in 2022-11-23T03:12:19.1000178Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1000301Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1000494Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:12:19.1000635Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1000838Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1000981Z self.run() 2022-11-23T03:12:19.1001334Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1001466Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1001776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1001894Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1002236Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1002347Z getattr(self, test_name)() 2022-11-23T03:12:19.1002684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1002772Z fn() 2022-11-23T03:12:19.1003112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1003228Z test(self, **param_kwargs) 2022-11-23T03:12:19.1003555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1003670Z return func(*args, **kwargs) 2022-11-23T03:12:19.1003950Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1004052Z self.run_subtests( 2022-11-23T03:12:19.1004382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1004531Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1004869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T03:12:19.1005185Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1005550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1005660Z output = model(*input) 2022-11-23T03:12:19.1005977Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1006107Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1006478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1006643Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1007003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1007118Z _lazy_init(state, module) 2022-11-23T03:12:19.1007460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1007584Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1007915Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1008082Z return func(*args, **kwargs) 2022-11-23T03:12:19.1008462Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1008558Z p_assert( 2022-11-23T03:12:19.1009045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1009158Z traceback.print_stack() 2022-11-23T03:12:19.1009266Z File "", line 1, in 2022-11-23T03:12:19.1009455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1009580Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1009762Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1009900Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1010095Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1010243Z self.run() 2022-11-23T03:12:19.1010606Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1010733Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1011060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1011185Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1011535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1011649Z getattr(self, test_name)() 2022-11-23T03:12:19.1011993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1012083Z fn() 2022-11-23T03:12:19.1012437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1012548Z test(self, **param_kwargs) 2022-11-23T03:12:19.1012901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1013016Z return func(*args, **kwargs) 2022-11-23T03:12:19.1013460Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1013561Z self.run_subtests( 2022-11-23T03:12:19.1013891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1014038Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1014376Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1014506Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1014858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1014971Z output = model(*input) 2022-11-23T03:12:19.1015278Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1015408Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1015763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1015922Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1016261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1016360Z _lazy_init(state, module) 2022-11-23T03:12:19.1016686Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1016819Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1017181Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1017297Z return func(*args, **kwargs) 2022-11-23T03:12:19.1017652Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1017742Z p_assert( 2022-11-23T03:12:19.1018056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1018160Z traceback.print_stack() 2022-11-23T03:12:19.1018279Z File "", line 1, in 2022-11-23T03:12:19.1018392Z File "", line 1, in 2022-11-23T03:12:19.1018588Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", 
line 116, in spawn_main 2022-11-23T03:12:19.1018714Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1018901Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1019087Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1019276Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1019403Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1019595Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1019686Z self.run() 2022-11-23T03:12:19.1019866Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1020007Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1020197Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1020334Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1020519Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1020614Z self.run() 2022-11-23T03:12:19.1020934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1021058Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1021245Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1021374Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1021718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1022005Z getattr(self, test_name)() 2022-11-23T03:12:19.1022322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1022445Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1022790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:19.1022878Z fn() 2022-11-23T03:12:19.1023226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1023341Z getattr(self, test_name)() 2022-11-23T03:12:19.1023700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1023805Z test(self, **param_kwargs) 2022-11-23T03:12:19.1024380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1024469Z fn() 2022-11-23T03:12:19.1024813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1024924Z return func(*args, **kwargs) 2022-11-23T03:12:19.1025272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1025382Z test(self, **param_kwargs) 2022-11-23T03:12:19.1025669Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1025769Z self.run_subtests( 2022-11-23T03:12:19.1026180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1026308Z return func(*args, **kwargs) 2022-11-23T03:12:19.1026648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1026802Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1027248Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1027350Z self.run_subtests( 2022-11-23T03:12:19.1027691Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1027821Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1028218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1028370Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1028725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1029009Z output = model(*input) 2022-11-23T03:12:19.1029362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1029506Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1029825Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1029955Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1030308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1030422Z output = model(*input) 2022-11-23T03:12:19.1030792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1030962Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1031278Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1031412Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1031811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1031924Z _lazy_init(state, module) 2022-11-23T03:12:19.1032439Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1032775Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1033115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1033256Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1033614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1033725Z _lazy_init(state, module) 2022-11-23T03:12:19.1034049Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1034165Z return func(*args, **kwargs) 2022-11-23T03:12:19.1034494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1034624Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1034990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1035082Z p_assert( 2022-11-23T03:12:19.1035564Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1035725Z return func(*args, **kwargs) 2022-11-23T03:12:19.1036056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1036167Z traceback.print_stack() 2022-11-23T03:12:19.1036691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1036785Z p_assert( 2022-11-23T03:12:19.1037103Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1037217Z traceback.print_stack() 2022-11-23T03:12:19.1037334Z File 
"", line 1, in 2022-11-23T03:12:19.1037537Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1037673Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1037857Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1038051Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1038255Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1038351Z self.run() 2022-11-23T03:12:19.1038543Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1038677Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1039008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1039133Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1039472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1039749Z getattr(self, test_name)() 2022-11-23T03:12:19.1040086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1040354Z fn() 2022-11-23T03:12:19.1040714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1040828Z test(self, **param_kwargs) 2022-11-23T03:12:19.1041174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1041291Z return func(*args, **kwargs) 2022-11-23T03:12:19.1041566Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1041673Z self.run_subtests( 2022-11-23T03:12:19.1041792Z File "", line 1, in 2022-11-23T03:12:19.1042133Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1042286Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1042644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1042796Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1042997Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1043284Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1043637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1043743Z output = model(*input) 2022-11-23T03:12:19.1043926Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1044061Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1044545Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1044675Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1044877Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1044968Z self.run() 2022-11-23T03:12:19.1045376Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1045548Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1045744Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1045877Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1046235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1046346Z _lazy_init(state, module) 2022-11-23T03:12:19.1046672Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1046786Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1047127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1047367Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1047720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1047835Z getattr(self, test_name)() 2022-11-23T03:12:19.1048166Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1048286Z return func(*args, **kwargs) 2022-11-23T03:12:19.1048625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1048716Z fn() 2022-11-23T03:12:19.1049086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1049177Z p_assert( 2022-11-23T03:12:19.1049527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1049643Z test(self, **param_kwargs) 2022-11-23T03:12:19.1049969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1050086Z traceback.print_stack() 2022-11-23T03:12:19.1050425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1050540Z return func(*args, **kwargs) 2022-11-23T03:12:19.1050824Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1051091Z self.run_subtests( 2022-11-23T03:12:19.1051426Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1051575Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1051916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1052063Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1052405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1052512Z output = model(*input) 2022-11-23T03:12:19.1052819Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1052945Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1053296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1053456Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1053798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1053904Z _lazy_init(state, module) 2022-11-23T03:12:19.1054285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1054410Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1054730Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1054844Z return func(*args, **kwargs) 2022-11-23T03:12:19.1055200Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1055288Z p_assert( 2022-11-23T03:12:19.1055602Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.1055712Z traceback.print_stack() 2022-11-23T03:12:19.1055820Z File "", line 1, in 2022-11-23T03:12:19.1055933Z File "", line 1, in 2022-11-23T03:12:19.1056128Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1056480Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1056687Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1056816Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1057006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1057148Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1057330Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1057472Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1057674Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1057767Z self.run() 2022-11-23T03:12:19.1057962Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1058055Z self.run() 2022-11-23T03:12:19.1058248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1058384Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1058578Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1058714Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1059049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1059175Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1059499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1059618Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1059967Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1060072Z getattr(self, test_name)() 2022-11-23T03:12:19.1060419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1060538Z getattr(self, test_name)() 2022-11-23T03:12:19.1060887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1060976Z fn() 2022-11-23T03:12:19.1061322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1061408Z fn() 2022-11-23T03:12:19.1061761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1061867Z test(self, **param_kwargs) 2022-11-23T03:12:19.1062377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1062488Z test(self, **param_kwargs) 2022-11-23T03:12:19.1062820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1062939Z return func(*args, **kwargs) 2022-11-23T03:12:19.1063312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1063428Z return func(*args, **kwargs) 2022-11-23T03:12:19.1063707Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1063799Z self.run_subtests( 2022-11-23T03:12:19.1064278Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1064384Z self.run_subtests( 
2022-11-23T03:12:19.1064724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1064870Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1065198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1065418Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1065853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1066162Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1066518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1066656Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1067021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1067128Z output = model(*input) 2022-11-23T03:12:19.1067490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1067604Z output = model(*input) 2022-11-23T03:12:19.1067920Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1068045Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1068361Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1068492Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1068857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1069023Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1069383Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1069547Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1069904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1070014Z _lazy_init(state, module) 2022-11-23T03:12:19.1070368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1070478Z _lazy_init(state, module) 2022-11-23T03:12:19.1070759Z File "", line 1, in 2022-11-23T03:12:19.1071088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1071217Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1071540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1071669Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1071854Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1071982Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1072555Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1072676Z return func(*args, **kwargs) 2022-11-23T03:12:19.1073004Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1073118Z return func(*args, **kwargs) 2022-11-23T03:12:19.1073309Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1073452Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1073808Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1073903Z 
p_assert( 2022-11-23T03:12:19.1074267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1074359Z p_assert( 2022-11-23T03:12:19.1074563Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1074708Z self.run() 2022-11-23T03:12:19.1075035Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1075154Z traceback.print_stack() 2022-11-23T03:12:19.1075472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1075588Z traceback.print_stack() 2022-11-23T03:12:19.1075780Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1075918Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1076244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1076368Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1076719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1076829Z getattr(self, test_name)() 2022-11-23T03:12:19.1077180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1077269Z fn() 2022-11-23T03:12:19.1077622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1077735Z test(self, **param_kwargs) 2022-11-23T03:12:19.1078079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1078195Z return func(*args, **kwargs) 2022-11-23T03:12:19.1078479Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, 
in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1078576Z self.run_subtests( 2022-11-23T03:12:19.1078915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1079074Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1079429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1079575Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1079938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1080048Z output = model(*input) 2022-11-23T03:12:19.1080362Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1080486Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1080848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1081014Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1081371Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1081535Z _lazy_init(state, module) 2022-11-23T03:12:19.1081883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1082016Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1082340Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1082453Z return func(*args, **kwargs) 2022-11-23T03:12:19.1082809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1082902Z p_assert( 
2022-11-23T03:12:19.1083225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1083338Z traceback.print_stack() 2022-11-23T03:12:19.1083456Z File "", line 1, in 2022-11-23T03:12:19.1083870Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1084002Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1084178Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1084313Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1084509Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1084600Z self.run() 2022-11-23T03:12:19.1084786Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1084917Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1085233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1085351Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1085680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1085797Z getattr(self, test_name)() 2022-11-23T03:12:19.1086139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1086226Z fn() 2022-11-23T03:12:19.1086568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1086679Z test(self, **param_kwargs) 2022-11-23T03:12:19.1087009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1087121Z return func(*args, **kwargs) 2022-11-23T03:12:19.1087391Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1087492Z self.run_subtests( 2022-11-23T03:12:19.1087822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1087979Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1088323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1088461Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1089049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1089161Z output = model(*input) 2022-11-23T03:12:19.1089464Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1089596Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1089961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1090125Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1090531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1090647Z _lazy_init(state, module) 2022-11-23T03:12:19.1090993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1091125Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1091441Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1091554Z return func(*args, **kwargs) 2022-11-23T03:12:19.1092075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T03:12:19.1092164Z p_assert( 2022-11-23T03:12:19.1092481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1092593Z traceback.print_stack() 2022-11-23T03:12:19.1092932Z File "", line 1, in 2022-11-23T03:12:19.1093138Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1093263Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1093454Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1093595Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1093796Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1093890Z self.run() 2022-11-23T03:12:19.1094080Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1094215Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1094536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1094658Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1095004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1095129Z getattr(self, test_name)() 2022-11-23T03:12:19.1095475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1095564Z fn() 2022-11-23T03:12:19.1095918Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1096030Z test(self, **param_kwargs) 2022-11-23T03:12:19.1096368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1096480Z return func(*args, **kwargs) 
2022-11-23T03:12:19.1096763Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1096867Z self.run_subtests( 2022-11-23T03:12:19.1097363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1097518Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1098052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1098197Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1098553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1098662Z output = model(*input) 2022-11-23T03:12:19.1098977Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1099109Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1099473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1099638Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1100044Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1100159Z _lazy_init(state, module) 2022-11-23T03:12:19.1100493Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1100626Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1100949Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1101065Z return func(*args, **kwargs) 2022-11-23T03:12:19.1101590Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1101681Z p_assert( 2022-11-23T03:12:19.1102177Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1102342Z traceback.print_stack() 2022-11-23T03:12:19.1102453Z File "", line 1, in 2022-11-23T03:12:19.1102656Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1102790Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1102982Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1103125Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1103327Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1103421Z self.run() 2022-11-23T03:12:19.1103604Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1103743Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1104285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1104411Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1104932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1105043Z getattr(self, test_name)() 2022-11-23T03:12:19.1105555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1105642Z fn() 2022-11-23T03:12:19.1105988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1106100Z test(self, **param_kwargs) 2022-11-23T03:12:19.1106441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:19.1106558Z return func(*args, **kwargs) 2022-11-23T03:12:19.1106846Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1106948Z self.run_subtests( 2022-11-23T03:12:19.1107292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1107444Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1107789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1107936Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1108304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1108411Z output = model(*input) 2022-11-23T03:12:19.1108724Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1108855Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1109220Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1109389Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1109806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1109917Z _lazy_init(state, module) 2022-11-23T03:12:19.1110262Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1110392Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1110714Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1110826Z return func(*args, **kwargs) 
2022-11-23T03:12:19.1111194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1111285Z p_assert( 2022-11-23T03:12:19.1111611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1111795Z traceback.print_stack() 2022-11-23T03:12:19.1111918Z File "", line 1, in 2022-11-23T03:12:19.1112115Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1112249Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1112442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1112586Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1112909Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1112995Z self.run() 2022-11-23T03:12:19.1113191Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1113325Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1113656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1113783Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1114137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1114250Z getattr(self, test_name)() 2022-11-23T03:12:19.1114598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1114678Z fn() 2022-11-23T03:12:19.1115031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1115145Z test(self, **param_kwargs) 2022-11-23T03:12:19.1115489Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1115603Z return func(*args, **kwargs) 2022-11-23T03:12:19.1115888Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1115998Z self.run_subtests( 2022-11-23T03:12:19.1116344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1116488Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1116842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1116989Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1117353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1117462Z output = model(*input) 2022-11-23T03:12:19.1117775Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1117904Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1118420Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1118621Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1118976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1119083Z _lazy_init(state, module) 2022-11-23T03:12:19.1119412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1119540Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1119853Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T03:12:19.1119964Z return func(*args, **kwargs) 2022-11-23T03:12:19.1120079Z File "", line 1, in 2022-11-23T03:12:19.1120422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1120561Z p_assert( 2022-11-23T03:12:19.1120881Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1120996Z traceback.print_stack() 2022-11-23T03:12:19.1121187Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1121316Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1121500Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1121628Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1121822Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1121911Z self.run() 2022-11-23T03:12:19.1122101Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1122232Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1122546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1122666Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1123006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1123110Z getattr(self, test_name)() 2022-11-23T03:12:19.1123447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1123531Z fn() 2022-11-23T03:12:19.1123873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1123980Z test(self, **param_kwargs) 
2022-11-23T03:12:19.1124314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1124597Z return func(*args, **kwargs) 2022-11-23T03:12:19.1124885Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1124985Z self.run_subtests( 2022-11-23T03:12:19.1125329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1125483Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1125835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1125980Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1126345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1126457Z output = model(*input) 2022-11-23T03:12:19.1126773Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1126895Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1127258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1127643Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1128002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1128109Z _lazy_init(state, module) 2022-11-23T03:12:19.1128438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1128566Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1128881Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1128984Z return func(*args, **kwargs) 2022-11-23T03:12:19.1129524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1129616Z p_assert( 2022-11-23T03:12:19.1129999Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1130114Z traceback.print_stack() 2022-11-23T03:12:19.1130235Z File "", line 1, in 2022-11-23T03:12:19.1130433Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1130565Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1130748Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1130888Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1131090Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1131183Z self.run() 2022-11-23T03:12:19.1131374Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1131509Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1131889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1132012Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1132363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1132475Z getattr(self, test_name)() 2022-11-23T03:12:19.1133149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1133236Z fn() 2022-11-23T03:12:19.1133355Z File "", line 1, in 2022-11-23T03:12:19.1133710Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1133824Z test(self, **param_kwargs) 2022-11-23T03:12:19.1134160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1134274Z return func(*args, **kwargs) 2022-11-23T03:12:19.1134479Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1134609Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1134893Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1134995Z self.run_subtests( 2022-11-23T03:12:19.1135187Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1135328Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1135660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1135975Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1136169Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1136257Z self.run() 2022-11-23T03:12:19.1136644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1136793Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1137168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1137303Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1137663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1137771Z output = model(*input) 2022-11-23T03:12:19.1138097Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1138217Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1138529Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1138659Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1139010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1139177Z getattr(self, test_name)() 2022-11-23T03:12:19.1139533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1139701Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1140207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1140296Z fn() 2022-11-23T03:12:19.1140822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1140933Z _lazy_init(state, module) 2022-11-23T03:12:19.1141283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1141394Z test(self, **param_kwargs) 2022-11-23T03:12:19.1141729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1141861Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1142203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1142318Z return func(*args, **kwargs) 2022-11-23T03:12:19.1142641Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1142753Z return func(*args, **kwargs) 
2022-11-23T03:12:19.1143044Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1143184Z self.run_subtests( 2022-11-23T03:12:19.1143727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1143821Z p_assert( 2022-11-23T03:12:19.1144363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1144682Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1145009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1145123Z traceback.print_stack() 2022-11-23T03:12:19.1145471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1145614Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1145968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1146077Z output = model(*input) 2022-11-23T03:12:19.1146389Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1146524Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1146959Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1147136Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1147492Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1147603Z _lazy_init(state, module) 2022-11-23T03:12:19.1147933Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1148066Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1148393Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1148509Z return func(*args, **kwargs) 2022-11-23T03:12:19.1148878Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1149029Z p_assert( 2022-11-23T03:12:19.1149359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1149473Z traceback.print_stack() 2022-11-23T03:12:19.1149585Z File "", line 1, in 2022-11-23T03:12:19.1149786Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1149918Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1150110Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1150248Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1150451Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1150544Z self.run() 2022-11-23T03:12:19.1150729Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1150869Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1151199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1151650Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1151998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1152108Z getattr(self, test_name)() 2022-11-23T03:12:19.1152453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:19.1152542Z fn() 2022-11-23T03:12:19.1152885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1152999Z test(self, **param_kwargs) 2022-11-23T03:12:19.1153343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1153462Z return func(*args, **kwargs) 2022-11-23T03:12:19.1153751Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1153858Z self.run_subtests( 2022-11-23T03:12:19.1154198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1154350Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1154858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1154999Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1155350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1155455Z output = model(*input) 2022-11-23T03:12:19.1155756Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1155888Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1156289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1156455Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1156968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1157080Z _lazy_init(state, module) 
2022-11-23T03:12:19.1157419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1157553Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1157877Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1157994Z return func(*args, **kwargs) 2022-11-23T03:12:19.1158358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1158505Z p_assert( 2022-11-23T03:12:19.1158825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1158943Z traceback.print_stack() 2022-11-23T03:12:19.1159063Z File "", line 1, in 2022-11-23T03:12:19.1159261Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1159392Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1159584Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1159723Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1160085Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1160169Z self.run() 2022-11-23T03:12:19.1160354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1160489Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1160808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1160931Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1161269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1161378Z getattr(self, test_name)() 2022-11-23T03:12:19.1161704Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1161789Z fn() 2022-11-23T03:12:19.1162128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1162238Z test(self, **param_kwargs) 2022-11-23T03:12:19.1162571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1162685Z return func(*args, **kwargs) 2022-11-23T03:12:19.1162963Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1163066Z self.run_subtests( 2022-11-23T03:12:19.1163386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1163536Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1163875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1164016Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1164365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1164471Z output = model(*input) 2022-11-23T03:12:19.1164772Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1164946Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1165308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1165460Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1165801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:19.1165907Z _lazy_init(state, module) 2022-11-23T03:12:19.1166234Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1166359Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1166670Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1166780Z return func(*args, **kwargs) 2022-11-23T03:12:19.1167193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1167275Z p_assert( 2022-11-23T03:12:19.1167584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1167696Z traceback.print_stack() 2022-11-23T03:12:19.1167811Z File "", line 1, in 2022-11-23T03:12:19.1168177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1168307Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1168501Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1168633Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1168835Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1168928Z self.run() 2022-11-23T03:12:19.1169130Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1169268Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1169600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1169725Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1170076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1170181Z getattr(self, test_name)() 
2022-11-23T03:12:19.1170528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1170620Z fn() 2022-11-23T03:12:19.1171133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1171242Z test(self, **param_kwargs) 2022-11-23T03:12:19.1171576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1171692Z return func(*args, **kwargs) 2022-11-23T03:12:19.1171968Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1172060Z self.run_subtests( 2022-11-23T03:12:19.1172566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1172718Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1173074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1173215Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1173578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1173688Z output = model(*input) 2022-11-23T03:12:19.1174057Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1174187Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1174554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1174719Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1175073Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1175187Z _lazy_init(state, module) 2022-11-23T03:12:19.1175525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1175659Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1175986Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1176153Z return func(*args, **kwargs) 2022-11-23T03:12:19.1176528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1176621Z p_assert( 2022-11-23T03:12:19.1176945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1177063Z traceback.print_stack() 2022-11-23T03:12:19.1177180Z File "", line 1, in 2022-11-23T03:12:19.1177378Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1177510Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1177695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1177837Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1178200Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1178297Z self.run() 2022-11-23T03:12:19.1178486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1178621Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1179124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1179238Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1179584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T03:12:19.1179696Z getattr(self, test_name)() 2022-11-23T03:12:19.1180042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1180129Z fn() 2022-11-23T03:12:19.1180479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1180593Z test(self, **param_kwargs) 2022-11-23T03:12:19.1180943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1181049Z return func(*args, **kwargs) 2022-11-23T03:12:19.1181336Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1181445Z self.run_subtests( 2022-11-23T03:12:19.1181940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1182089Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1182427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1182563Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1183086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1183191Z output = model(*input) 2022-11-23T03:12:19.1183559Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1183702Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1184267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1184441Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.1184797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1184907Z _lazy_init(state, module) 2022-11-23T03:12:19.1185244Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1185367Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1185695Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1186039Z return func(*args, **kwargs) 2022-11-23T03:12:19.1186398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1186488Z p_assert( 2022-11-23T03:12:19.1186800Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1186911Z traceback.print_stack() 2022-11-23T03:12:19.1187027Z File "", line 1, in 2022-11-23T03:12:19.1187211Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1187340Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1187526Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1187661Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1187777Z File "", line 1, in 2022-11-23T03:12:19.1187976Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1188070Z self.run() 2022-11-23T03:12:19.1188248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1188377Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1188571Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1188697Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1189256Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1189382Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1189573Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1189714Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1190057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1190177Z getattr(self, test_name)() 2022-11-23T03:12:19.1190380Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1190471Z self.run() 2022-11-23T03:12:19.1190819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1190907Z fn() 2022-11-23T03:12:19.1191103Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1191238Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1191586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1191700Z test(self, **param_kwargs) 2022-11-23T03:12:19.1192021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1192298Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1192732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1192851Z return func(*args, **kwargs) 2022-11-23T03:12:19.1193365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1193479Z getattr(self, test_name)() 2022-11-23T03:12:19.1193755Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1193858Z self.run_subtests( 2022-11-23T03:12:19.1194208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1194293Z fn() 2022-11-23T03:12:19.1194633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1194785Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1195192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1195303Z test(self, **param_kwargs) 2022-11-23T03:12:19.1195642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1195786Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1196130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1196403Z return func(*args, **kwargs) 2022-11-23T03:12:19.1196754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1196861Z output = model(*input) 2022-11-23T03:12:19.1197138Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1197244Z self.run_subtests( 2022-11-23T03:12:19.1197546Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1197674Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1198187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1198338Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 
2022-11-23T03:12:19.1198700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1198867Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1199216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1199359Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1199714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1199829Z _lazy_init(state, module) 2022-11-23T03:12:19.1200193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1200300Z output = model(*input) 2022-11-23T03:12:19.1200637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1200767Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1201079Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1201210Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1201532Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1201648Z return func(*args, **kwargs) 2022-11-23T03:12:19.1202217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1202383Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1202740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1202832Z p_assert( 2022-11-23T03:12:19.1203177Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1203284Z _lazy_init(state, module) 2022-11-23T03:12:19.1203589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1203707Z traceback.print_stack() 2022-11-23T03:12:19.1204215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1204347Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1204728Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1204846Z return func(*args, **kwargs) 2022-11-23T03:12:19.1205215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1205313Z p_assert( 2022-11-23T03:12:19.1205625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1205741Z traceback.print_stack() 2022-11-23T03:12:19.1205862Z File "", line 1, in 2022-11-23T03:12:19.1206061Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1206193Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1206383Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1206524Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1206730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1206822Z self.run() 2022-11-23T03:12:19.1207014Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1207150Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1207478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in 
_run 2022-11-23T03:12:19.1207605Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1207958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1208069Z getattr(self, test_name)() 2022-11-23T03:12:19.1208410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1208498Z fn() 2022-11-23T03:12:19.1208849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1208971Z test(self, **param_kwargs) 2022-11-23T03:12:19.1209313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1209428Z return func(*args, **kwargs) 2022-11-23T03:12:19.1209870Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1209969Z self.run_subtests( 2022-11-23T03:12:19.1210290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1210437Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1210784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1210920Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1211509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1211626Z output = model(*input) 2022-11-23T03:12:19.1211941Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1212071Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1212427Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1212596Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1212949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1213064Z _lazy_init(state, module) 2022-11-23T03:12:19.1213402Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1213587Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1214083Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1214194Z return func(*args, **kwargs) 2022-11-23T03:12:19.1214540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1214629Z p_assert( 2022-11-23T03:12:19.1214942Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1215054Z traceback.print_stack() 2022-11-23T03:12:19.1215168Z File "", line 1, in 2022-11-23T03:12:19.1215361Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1215489Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1215675Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1215809Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1216009Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1216104Z self.run() 2022-11-23T03:12:19.1216290Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1216422Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1216742Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1216864Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1217200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1217303Z getattr(self, test_name)() 2022-11-23T03:12:19.1217640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1217728Z fn() 2022-11-23T03:12:19.1218076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1218185Z test(self, **param_kwargs) 2022-11-23T03:12:19.1218521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1218633Z return func(*args, **kwargs) 2022-11-23T03:12:19.1218901Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1219005Z self.run_subtests( 2022-11-23T03:12:19.1219332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1219482Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1220004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1220153Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1220562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1220686Z output = model(*input) 2022-11-23T03:12:19.1221006Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1221132Z return 
forward_call(*input, **kwargs) 2022-11-23T03:12:19.1221497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1221664Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1221787Z File "", line 1, in 2022-11-23T03:12:19.1222141Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1222254Z _lazy_init(state, module) 2022-11-23T03:12:19.1222504Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1222643Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1222976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1223109Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1223303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1223444Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1224121Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1224241Z return func(*args, **kwargs) 2022-11-23T03:12:19.1224439Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1224524Z self.run() 2022-11-23T03:12:19.1224882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1225154Z p_assert( 2022-11-23T03:12:19.1225355Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1225491Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1225815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1225935Z traceback.print_stack() 
2022-11-23T03:12:19.1226263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1226378Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1226727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1226845Z getattr(self, test_name)() 2022-11-23T03:12:19.1227200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1227294Z fn() 2022-11-23T03:12:19.1227649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1227764Z test(self, **param_kwargs) 2022-11-23T03:12:19.1228265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1228369Z return func(*args, **kwargs) 2022-11-23T03:12:19.1228645Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1228747Z self.run_subtests( 2022-11-23T03:12:19.1229072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1229216Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1229729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1229878Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1230312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1230421Z output = model(*input) 2022-11-23T03:12:19.1230741Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T03:12:19.1230873Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1231241Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1231407Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1231807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1231922Z _lazy_init(state, module) 2022-11-23T03:12:19.1232265Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1232458Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1232792Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1233077Z return func(*args, **kwargs) 2022-11-23T03:12:19.1233614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1233707Z p_assert( 2022-11-23T03:12:19.1234029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1234145Z traceback.print_stack() 2022-11-23T03:12:19.1234267Z File "", line 1, in 2022-11-23T03:12:19.1234457Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1234595Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1234790Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1234935Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1235140Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1235235Z self.run() 2022-11-23T03:12:19.1235428Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1235553Z 
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1235887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1236011Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1236514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1236624Z getattr(self, test_name)() 2022-11-23T03:12:19.1236959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1237046Z fn() 2022-11-23T03:12:19.1237573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1237681Z test(self, **param_kwargs) 2022-11-23T03:12:19.1238028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1238143Z return func(*args, **kwargs) 2022-11-23T03:12:19.1238427Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1238529Z self.run_subtests( 2022-11-23T03:12:19.1238871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1239021Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1239373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1239557Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1239933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1240045Z output = model(*input) 2022-11-23T03:12:19.1240357Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1427, in _call_impl 2022-11-23T03:12:19.1240488Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1240850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1241014Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1241369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1241471Z _lazy_init(state, module) 2022-11-23T03:12:19.1241886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1242023Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1242349Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1242464Z return func(*args, **kwargs) 2022-11-23T03:12:19.1242833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1242923Z p_assert( 2022-11-23T03:12:19.1243408Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1243512Z traceback.print_stack() 2022-11-23T03:12:19.1243633Z File "", line 1, in 2022-11-23T03:12:19.1243824Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1243954Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1244145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1244283Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1244476Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1244565Z self.run() 2022-11-23T03:12:19.1244918Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T03:12:19.1245054Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1245386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1245510Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1245864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1245978Z getattr(self, test_name)() 2022-11-23T03:12:19.1246324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1246423Z fn() 2022-11-23T03:12:19.1246770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1246888Z test(self, **param_kwargs) 2022-11-23T03:12:19.1247232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1247347Z return func(*args, **kwargs) 2022-11-23T03:12:19.1247629Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1247732Z self.run_subtests( 2022-11-23T03:12:19.1248068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1248217Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1248612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1248766Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1249130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1249241Z output = model(*input) 2022-11-23T03:12:19.1249551Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1249684Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1250047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1250211Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1250555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1250712Z _lazy_init(state, module) 2022-11-23T03:12:19.1251053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1251184Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1251510Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1251786Z return func(*args, **kwargs) 2022-11-23T03:12:19.1252141Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1252229Z p_assert( 2022-11-23T03:12:19.1252535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1252649Z traceback.print_stack() 2022-11-23T03:12:19.1252764Z File "", line 1, in 2022-11-23T03:12:19.1252958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1253089Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1253275Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1253410Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1253595Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1253690Z self.run() 2022-11-23T03:12:19.1253876Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1254008Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1254326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1254447Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1254782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1254890Z getattr(self, test_name)() 2022-11-23T03:12:19.1255223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1255309Z fn() 2022-11-23T03:12:19.1255649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1255758Z test(self, **param_kwargs) 2022-11-23T03:12:19.1256088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1256196Z return func(*args, **kwargs) 2022-11-23T03:12:19.1256469Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1256568Z self.run_subtests( 2022-11-23T03:12:19.1256889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1257208Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1257613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1257761Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1258124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1258232Z output = model(*input) 
2022-11-23T03:12:19.1258545Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1258675Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1259032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1259198Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1259550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1259706Z _lazy_init(state, module) 2022-11-23T03:12:19.1260052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1260183Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1260665Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1260776Z return func(*args, **kwargs) 2022-11-23T03:12:19.1261120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1261209Z p_assert( 2022-11-23T03:12:19.1261524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1261635Z traceback.print_stack() 2022-11-23T03:12:19.1261751Z File "", line 1, in 2022-11-23T03:12:19.1261940Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1262076Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1262261Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1262389Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1262583Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1262672Z self.run() 
2022-11-23T03:12:19.1262860Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1262989Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1263306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1263426Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1263755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1264059Z getattr(self, test_name)() 2022-11-23T03:12:19.1264416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1264501Z fn() 2022-11-23T03:12:19.1264839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1264947Z test(self, **param_kwargs) 2022-11-23T03:12:19.1265282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1265394Z return func(*args, **kwargs) 2022-11-23T03:12:19.1265659Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1265760Z self.run_subtests( 2022-11-23T03:12:19.1266090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1266238Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1266643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1266788Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1267139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T03:12:19.1267244Z output = model(*input) 2022-11-23T03:12:19.1267539Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1267666Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1268013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1268349Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1268701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1268882Z _lazy_init(state, module) 2022-11-23T03:12:19.1269222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1269356Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1269673Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1269786Z return func(*args, **kwargs) 2022-11-23T03:12:19.1270152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1270242Z p_assert( 2022-11-23T03:12:19.1270565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1270679Z traceback.print_stack() 2022-11-23T03:12:19.1270799Z File "", line 1, in 2022-11-23T03:12:19.1271007Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1271131Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1271321Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1271461Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1271661Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 
2022-11-23T03:12:19.1271755Z self.run() 2022-11-23T03:12:19.1271946Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1272079Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1272410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1272526Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1272874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1272989Z getattr(self, test_name)() 2022-11-23T03:12:19.1273338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1273427Z fn() 2022-11-23T03:12:19.1273780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1273892Z test(self, **param_kwargs) 2022-11-23T03:12:19.1274237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1274344Z return func(*args, **kwargs) 2022-11-23T03:12:19.1274627Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1274730Z self.run_subtests( 2022-11-23T03:12:19.1275070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1275269Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1275631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1275774Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1276137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T03:12:19.1276238Z output = model(*input) 2022-11-23T03:12:19.1276553Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1276684Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1277047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1277210Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1277617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1277728Z _lazy_init(state, module) 2022-11-23T03:12:19.1278068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1278193Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1278520Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1278633Z return func(*args, **kwargs) 2022-11-23T03:12:19.1278996Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1279086Z p_assert( 2022-11-23T03:12:19.1279408Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1279523Z traceback.print_stack() 2022-11-23T03:12:19.1279628Z dist init r=3, world=4 2022-11-23T03:12:19.1279942Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1280247Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:19.1280545Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1280839Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1281289Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1281578Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1281858Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1282139Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1282417Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1282695Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1282972Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1283295Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:19.1283399Z dist init r=0, world=4 2022-11-23T03:12:19.1283701Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1283994Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1284279Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1284561Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1285073Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1285368Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1285658Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1285947Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1286235Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1286528Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.1286810Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1287098Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1287198Z dist init r=1, world=4 2022-11-23T03:12:19.1287510Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1287811Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1288114Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1288565Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1288898Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1289180Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1289637Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1289974Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:19.1290269Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1290558Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1290846Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1291137Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1291237Z dist init r=2, world=4 2022-11-23T03:12:19.1291550Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1291902Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1292200Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1292492Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1292934Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1293215Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:19.1293493Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1293960Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1294250Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1294537Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1294825Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1295119Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1295209Z ok (6.322s) 2022-11-23T03:12:19.1295585Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27677 2022-11-23T03:12:19.1295795Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27678 2022-11-23T03:12:19.1296000Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27679 2022-11-23T03:12:19.1296197Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27680 2022-11-23T03:12:19.1296724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1296889Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1297296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1297476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1297820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1297977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1298502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1298685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1299033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1299196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1299561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1299814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1300168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:19.1300333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1300699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1300877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1301111Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.1301337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.1301567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.1301803Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.1302353Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1302728Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1303095Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1303462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.1303668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.1304072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.1304293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.1304498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.1305475Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1305574Z warnings.warn( 2022-11-23T03:12:19.1306817Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1306934Z warnings.warn( 2022-11-23T03:12:19.1307933Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1308032Z warnings.warn( 2022-11-23T03:12:19.1309034Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1309186Z warnings.warn( 2022-11-23T03:12:19.1309305Z File "", line 1, in 2022-11-23T03:12:19.1309509Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1309641Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1309826Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1309966Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1310167Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1310259Z self.run() 2022-11-23T03:12:19.1310452Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1310595Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1310929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1311044Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1311394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1311506Z getattr(self, test_name)() 2022-11-23T03:12:19.1311856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T03:12:19.1311943Z fn() 2022-11-23T03:12:19.1312296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1312408Z test(self, **param_kwargs) 2022-11-23T03:12:19.1312752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1312863Z return func(*args, **kwargs) 2022-11-23T03:12:19.1313152Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1313255Z self.run_subtests( 2022-11-23T03:12:19.1313593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1313744Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1314096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1314236Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1314748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1314847Z output = model(*input) 2022-11-23T03:12:19.1315152Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1315326Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1315689Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1315847Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1316189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1316295Z _lazy_init(state, module) 2022-11-23T03:12:19.1316626Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1316754Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1317061Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1317172Z return func(*args, **kwargs) 2022-11-23T03:12:19.1317578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1317668Z p_assert( 2022-11-23T03:12:19.1317982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1318093Z traceback.print_stack() 2022-11-23T03:12:19.1318206Z File "", line 1, in 2022-11-23T03:12:19.1318391Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1318519Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1318701Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1318834Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1319029Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1319294Z self.run() 2022-11-23T03:12:19.1319492Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1319630Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1319953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1320074Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1320422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1320534Z getattr(self, test_name)() 2022-11-23T03:12:19.1320878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:19.1320965Z fn() 2022-11-23T03:12:19.1321317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1321428Z test(self, **param_kwargs) 2022-11-23T03:12:19.1321767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1321888Z return func(*args, **kwargs) 2022-11-23T03:12:19.1322503Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1322606Z self.run_subtests( 2022-11-23T03:12:19.1322948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1323099Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1323452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1323593Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1323952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1324065Z output = model(*input) 2022-11-23T03:12:19.1324602Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1324979Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1325502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1325942Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1326483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1326852Z _lazy_init(state, module) 
2022-11-23T03:12:19.1327325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1327708Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1328193Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1328631Z return func(*args, **kwargs) 2022-11-23T03:12:19.1329149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1329508Z p_assert( 2022-11-23T03:12:19.1329949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1330302Z traceback.print_stack() 2022-11-23T03:12:19.1330568Z File "", line 1, in 2022-11-23T03:12:19.1330916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1331255Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1331603Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1332012Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1332371Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1332690Z self.run() 2022-11-23T03:12:19.1333012Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1333360Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1333846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1334232Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1334734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1335098Z getattr(self, test_name)() 2022-11-23T03:12:19.1335585Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1335927Z fn() 2022-11-23T03:12:19.1336390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1336757Z test(self, **param_kwargs) 2022-11-23T03:12:19.1337253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1337620Z return func(*args, **kwargs) 2022-11-23T03:12:19.1338042Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1338436Z self.run_subtests( 2022-11-23T03:12:19.1338908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1339305Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1339821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1340216Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1341077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1341446Z output = model(*input) 2022-11-23T03:12:19.1341946Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1342313Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1342830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1343251Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1344146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:19.1344510Z _lazy_init(state, module) 2022-11-23T03:12:19.1344972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1345516Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1345997Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1346439Z return func(*args, **kwargs) 2022-11-23T03:12:19.1346947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1347304Z p_assert( 2022-11-23T03:12:19.1347744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1348094Z traceback.print_stack() 2022-11-23T03:12:19.1348353Z File "", line 1, in 2022-11-23T03:12:19.1348699Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1349041Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1349390Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1349736Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1350098Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1350408Z self.run() 2022-11-23T03:12:19.1350718Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1351064Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1351545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1351904Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1352548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1352895Z getattr(self, test_name)() 
2022-11-23T03:12:19.1353362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1353689Z fn() 2022-11-23T03:12:19.1354132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1354489Z test(self, **param_kwargs) 2022-11-23T03:12:19.1354961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1355311Z return func(*args, **kwargs) 2022-11-23T03:12:19.1355717Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1356090Z self.run_subtests( 2022-11-23T03:12:19.1356547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1356933Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1357602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1358001Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1358528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1358962Z output = model(*input) 2022-11-23T03:12:19.1359422Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1359781Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1360305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1360730Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1361265Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1361635Z _lazy_init(state, module) 2022-11-23T03:12:19.1362109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1362488Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1363174Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1363522Z return func(*args, **kwargs) 2022-11-23T03:12:19.1364007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1364351Z p_assert( 2022-11-23T03:12:19.1364778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1365115Z traceback.print_stack() 2022-11-23T03:12:19.1365367Z File "", line 1, in 2022-11-23T03:12:19.1365698Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1366133Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1366467Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1366802Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1367158Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1367451Z self.run() 2022-11-23T03:12:19.1367751Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1368078Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1368544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1369076Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1369575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T03:12:19.1369933Z getattr(self, test_name)() 2022-11-23T03:12:19.1370421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1370758Z fn() 2022-11-23T03:12:19.1371218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1371584Z test(self, **param_kwargs) 2022-11-23T03:12:19.1372225Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1372578Z return func(*args, **kwargs) 2022-11-23T03:12:19.1372982Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1373545Z self.run_subtests( 2022-11-23T03:12:19.1374018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1374418Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1374938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1375337Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1375916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1376291Z output = model(*input) 2022-11-23T03:12:19.1376744Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1377104Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1377622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1378046Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.1378580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1379101Z _lazy_init(state, module) 2022-11-23T03:12:19.1379555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1379918Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1380445Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1380790Z return func(*args, **kwargs) 2022-11-23T03:12:19.1381456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1381812Z p_assert( 2022-11-23T03:12:19.1382259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1382610Z traceback.print_stack() 2022-11-23T03:12:19.1382869Z File "", line 1, in 2022-11-23T03:12:19.1383212Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1383550Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1384099Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1384460Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1384833Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1385135Z self.run() 2022-11-23T03:12:19.1385444Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1385781Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1386267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1386627Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1387123Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1387633Z getattr(self, test_name)() 2022-11-23T03:12:19.1388104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1388433Z fn() 2022-11-23T03:12:19.1388921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1389283Z test(self, **param_kwargs) 2022-11-23T03:12:19.1389752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1390283Z return func(*args, **kwargs) 2022-11-23T03:12:19.1390704Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1391091Z self.run_subtests( 2022-11-23T03:12:19.1391562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1391959Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1392475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1392871Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1393616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1394158Z output = model(*input) 2022-11-23T03:12:19.1394609Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1394972Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1395494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T03:12:19.1395918Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1396454Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1396821Z _lazy_init(state, module) 2022-11-23T03:12:19.1397293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1397911Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1398386Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1398907Z return func(*args, **kwargs) 2022-11-23T03:12:19.1399409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1399765Z p_assert( 2022-11-23T03:12:19.1400208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1400561Z traceback.print_stack() 2022-11-23T03:12:19.1400821Z File "", line 1, in 2022-11-23T03:12:19.1401164Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1401506Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1401849Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1402198Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1402566Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1402873Z self.run() 2022-11-23T03:12:19.1403182Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1403520Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1404001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1404363Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:19.1404856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1405369Z getattr(self, test_name)() 2022-11-23T03:12:19.1405840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1406343Z fn() 2022-11-23T03:12:19.1406813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1407178Z test(self, **param_kwargs) 2022-11-23T03:12:19.1407664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1408025Z return func(*args, **kwargs) 2022-11-23T03:12:19.1408443Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1408831Z self.run_subtests( 2022-11-23T03:12:19.1409303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1409699Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1410215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1410614Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1411187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1411561Z output = model(*input) 2022-11-23T03:12:19.1412015Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1412374Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1412895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T03:12:19.1413319Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1413856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1414224Z _lazy_init(state, module) 2022-11-23T03:12:19.1414696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1415132Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1415408Z File "", line 1, in 2022-11-23T03:12:19.1415876Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1416234Z return func(*args, **kwargs) 2022-11-23T03:12:19.1416745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1417106Z p_assert( 2022-11-23T03:12:19.1417417Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1417759Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1418243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1418748Z traceback.print_stack() 2022-11-23T03:12:19.1419071Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1419412Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1419762Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1420052Z self.run() 2022-11-23T03:12:19.1420353Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1420683Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1421146Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1421494Z 
self.run_test(test_name, pipe) 2022-11-23T03:12:19.1421975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1422320Z getattr(self, test_name)() 2022-11-23T03:12:19.1422792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1423122Z fn() 2022-11-23T03:12:19.1423572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1424121Z test(self, **param_kwargs) 2022-11-23T03:12:19.1424603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1425133Z return func(*args, **kwargs) 2022-11-23T03:12:19.1425551Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1425941Z self.run_subtests( 2022-11-23T03:12:19.1426410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1426809Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1427326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1427798Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1428342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1428868Z output = model(*input) 2022-11-23T03:12:19.1429305Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1429653Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1430157Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1430755Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1431293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1431667Z _lazy_init(state, module) 2022-11-23T03:12:19.1432260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1432647Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1433133Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1433492Z return func(*args, **kwargs) 2022-11-23T03:12:19.1434325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1434681Z p_assert( 2022-11-23T03:12:19.1435126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1435474Z traceback.print_stack() 2022-11-23T03:12:19.1435733Z File "", line 1, in 2022-11-23T03:12:19.1436078Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1436419Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1436772Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1437271Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1437627Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1438104Z self.run() 2022-11-23T03:12:19.1438415Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1438756Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1439242Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1439604Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1440106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1440468Z getattr(self, test_name)() 2022-11-23T03:12:19.1440958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1441303Z fn() 2022-11-23T03:12:19.1441771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1442138Z test(self, **param_kwargs) 2022-11-23T03:12:19.1442624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1442987Z return func(*args, **kwargs) 2022-11-23T03:12:19.1443474Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1444018Z self.run_subtests( 2022-11-23T03:12:19.1444474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1444860Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1445356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1445977Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1446510Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1446870Z output = model(*input) 2022-11-23T03:12:19.1447320Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1447680Z return 
forward_call(*input, **kwargs) 2022-11-23T03:12:19.1448197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1448618Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1449155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1449520Z _lazy_init(state, module) 2022-11-23T03:12:19.1450049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1450428Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1450911Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1451266Z return func(*args, **kwargs) 2022-11-23T03:12:19.1451765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1452120Z p_assert( 2022-11-23T03:12:19.1452560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1452907Z traceback.print_stack() 2022-11-23T03:12:19.1453164Z File "", line 1, in 2022-11-23T03:12:19.1453510Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1453855Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1454204Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1454552Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1454916Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1455368Z self.run() 2022-11-23T03:12:19.1455845Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1456186Z self._target(*self._args, **self._kwargs) 
2022-11-23T03:12:19.1456666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1457036Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1457532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1457893Z getattr(self, test_name)() 2022-11-23T03:12:19.1458384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1458731Z fn() 2022-11-23T03:12:19.1459194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1459555Z test(self, **param_kwargs) 2022-11-23T03:12:19.1460037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1460400Z return func(*args, **kwargs) 2022-11-23T03:12:19.1460818Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1461358Z self.run_subtests( 2022-11-23T03:12:19.1461813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1462196Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1462744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1463134Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1463643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1464192Z output = model(*input) 2022-11-23T03:12:19.1464630Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T03:12:19.1464980Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1465480Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1465891Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1466408Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1466843Z _lazy_init(state, module) 2022-11-23T03:12:19.1467306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1467675Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1468142Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1468484Z return func(*args, **kwargs) 2022-11-23T03:12:19.1468964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1469488Z p_assert( 2022-11-23T03:12:19.1469935Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1470288Z traceback.print_stack() 2022-11-23T03:12:19.1470547Z File "", line 1, in 2022-11-23T03:12:19.1470891Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1471241Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1471514Z File "", line 1, in 2022-11-23T03:12:19.1471853Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1472364Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1472712Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1473011Z self.run() 2022-11-23T03:12:19.1473313Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in 
spawn_main 2022-11-23T03:12:19.1473821Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1474166Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1474505Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1474842Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1475185Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1475691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1476055Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1476398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1476706Z self.run() 2022-11-23T03:12:19.1477175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1477536Z getattr(self, test_name)() 2022-11-23T03:12:19.1477867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1478206Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1478704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1479260Z fn() 2022-11-23T03:12:19.1479870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1480298Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1480805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1481173Z test(self, **param_kwargs) 2022-11-23T03:12:19.1481657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1482015Z getattr(self, test_name)() 2022-11-23T03:12:19.1482648Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1483002Z return func(*args, **kwargs) 2022-11-23T03:12:19.1483644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1483984Z fn() 2022-11-23T03:12:19.1484375Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1484823Z self.run_subtests( 2022-11-23T03:12:19.1485310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1485678Z test(self, **param_kwargs) 2022-11-23T03:12:19.1486312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1486688Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1487188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1487536Z return func(*args, **kwargs) 2022-11-23T03:12:19.1488009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1488386Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1488875Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1489259Z self.run_subtests( 2022-11-23T03:12:19.1489923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1490295Z output = model(*input) 2022-11-23T03:12:19.1490770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 
2022-11-23T03:12:19.1491171Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1491648Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1492008Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1492511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1492903Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1493591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1494005Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1494723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1495087Z output = model(*input) 2022-11-23T03:12:19.1495577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1495944Z _lazy_init(state, module) 2022-11-23T03:12:19.1496385Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1496744Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1497236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1497823Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1498329Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1498742Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1499416Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1499766Z return func(*args, **kwargs) 
2022-11-23T03:12:19.1500261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1500628Z _lazy_init(state, module) 2022-11-23T03:12:19.1501131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1501479Z p_assert( 2022-11-23T03:12:19.1501989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1502368Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1503003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1503346Z traceback.print_stack() 2022-11-23T03:12:19.1503797Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1504518Z return func(*args, **kwargs) 2022-11-23T03:12:19.1505027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1505382Z p_assert( 2022-11-23T03:12:19.1505828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1506177Z traceback.print_stack() 2022-11-23T03:12:19.1506435Z File "", line 1, in 2022-11-23T03:12:19.1506786Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1507124Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1507465Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1507811Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1508175Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1508479Z self.run() 2022-11-23T03:12:19.1508789Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1509130Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1509611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1517980Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1518571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1518940Z getattr(self, test_name)() 2022-11-23T03:12:19.1519427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1519765Z fn() 2022-11-23T03:12:19.1520208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1520569Z test(self, **param_kwargs) 2022-11-23T03:12:19.1521037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1521389Z return func(*args, **kwargs) 2022-11-23T03:12:19.1521796Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1522174Z self.run_subtests( 2022-11-23T03:12:19.1522631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1523141Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1523667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1524053Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1524561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1524914Z output = model(*input) 
2022-11-23T03:12:19.1525353Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1525707Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1526392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1526826Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1527476Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1527848Z _lazy_init(state, module) 2022-11-23T03:12:19.1528322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1528701Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1529347Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1529686Z return func(*args, **kwargs) 2022-11-23T03:12:19.1530173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1530517Z p_assert( 2022-11-23T03:12:19.1531135Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1531484Z traceback.print_stack() 2022-11-23T03:12:19.1531748Z File "", line 1, in 2022-11-23T03:12:19.1532167Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1532506Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1532854Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1533195Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1533552Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1533861Z self.run() 
2022-11-23T03:12:19.1534497Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1534837Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1535326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1535687Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1536183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1536552Z getattr(self, test_name)() 2022-11-23T03:12:19.1537044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1537538Z fn() 2022-11-23T03:12:19.1537977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1538514Z test(self, **param_kwargs) 2022-11-23T03:12:19.1538996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1539359Z return func(*args, **kwargs) 2022-11-23T03:12:19.1539778Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1540167Z self.run_subtests( 2022-11-23T03:12:19.1540645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1541096Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1541631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1542033Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1542559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T03:12:19.1542924Z output = model(*input) 2022-11-23T03:12:19.1543534Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1544172Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1544888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1545407Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1545948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1546317Z _lazy_init(state, module) 2022-11-23T03:12:19.1546785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1547162Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1547649Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1547762Z return func(*args, **kwargs) 2022-11-23T03:12:19.1548120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1548213Z p_assert( 2022-11-23T03:12:19.1548537Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1548658Z traceback.print_stack() 2022-11-23T03:12:19.1548779Z File "", line 1, in 2022-11-23T03:12:19.1548979Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1549110Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1549300Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1549431Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1549632Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 
2022-11-23T03:12:19.1549723Z self.run() 2022-11-23T03:12:19.1549920Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1550057Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1550385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1550505Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1550860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1550966Z getattr(self, test_name)() 2022-11-23T03:12:19.1551316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1551402Z fn() 2022-11-23T03:12:19.1551753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1551863Z test(self, **param_kwargs) 2022-11-23T03:12:19.1552204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1552316Z return func(*args, **kwargs) 2022-11-23T03:12:19.1552593Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1552861Z self.run_subtests( 2022-11-23T03:12:19.1553260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1553412Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1553751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1553889Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1554239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T03:12:19.1554342Z output = model(*input) 2022-11-23T03:12:19.1554644Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1554763Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1555113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1555321Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1555663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1555768Z _lazy_init(state, module) 2022-11-23T03:12:19.1556091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1556218Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1556532Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1556635Z return func(*args, **kwargs) 2022-11-23T03:12:19.1556749Z File "", line 1, in 2022-11-23T03:12:19.1557102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1557190Z p_assert( 2022-11-23T03:12:19.1557388Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1557518Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1558014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1558122Z traceback.print_stack() 2022-11-23T03:12:19.1558312Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1558453Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1558653Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, 
in _bootstrap 2022-11-23T03:12:19.1558745Z self.run() 2022-11-23T03:12:19.1558936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1559068Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1559394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1559514Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1559864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1559979Z getattr(self, test_name)() 2022-11-23T03:12:19.1560327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1560413Z fn() 2022-11-23T03:12:19.1560762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1560874Z test(self, **param_kwargs) 2022-11-23T03:12:19.1561379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1561482Z return func(*args, **kwargs) 2022-11-23T03:12:19.1561756Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1561859Z self.run_subtests( 2022-11-23T03:12:19.1562233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1562383Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1562719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1562862Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1563208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 
827, in _train_for_several_steps 2022-11-23T03:12:19.1563306Z output = model(*input) 2022-11-23T03:12:19.1563605Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1563732Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1564080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1564291Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1564633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1564738Z _lazy_init(state, module) 2022-11-23T03:12:19.1565065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1565184Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1565495Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1565608Z return func(*args, **kwargs) 2022-11-23T03:12:19.1565958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1566044Z p_assert( 2022-11-23T03:12:19.1566356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1566474Z traceback.print_stack() 2022-11-23T03:12:19.1566587Z File "", line 1, in 2022-11-23T03:12:19.1566771Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1566896Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1567079Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1567213Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1567406Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", 
line 315, in _bootstrap 2022-11-23T03:12:19.1567495Z self.run() 2022-11-23T03:12:19.1567681Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1567804Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1568121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1568241Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1568759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1568871Z getattr(self, test_name)() 2022-11-23T03:12:19.1569213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1569300Z fn() 2022-11-23T03:12:19.1569650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1569753Z test(self, **param_kwargs) 2022-11-23T03:12:19.1570096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1570209Z return func(*args, **kwargs) 2022-11-23T03:12:19.1570493Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1570643Z self.run_subtests( 2022-11-23T03:12:19.1570988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1571140Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1571492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1571625Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1571988Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1572097Z output = model(*input) 2022-11-23T03:12:19.1572410Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1572538Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1572955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1573119Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1573472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1573573Z _lazy_init(state, module) 2022-11-23T03:12:19.1573912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1574042Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1574367Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1574479Z return func(*args, **kwargs) 2022-11-23T03:12:19.1574843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1574941Z p_assert( 2022-11-23T03:12:19.1575265Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1575372Z traceback.print_stack() 2022-11-23T03:12:19.1575489Z File "", line 1, in 2022-11-23T03:12:19.1575683Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1575812Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1576001Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1576139Z return self._bootstrap(parent_sentinel) 
2022-11-23T03:12:19.1576338Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1576430Z self.run() 2022-11-23T03:12:19.1576612Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1576744Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1577073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1577202Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1577551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1577663Z getattr(self, test_name)() 2022-11-23T03:12:19.1578005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1578084Z fn() 2022-11-23T03:12:19.1578438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1578549Z test(self, **param_kwargs) 2022-11-23T03:12:19.1578891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1579003Z return func(*args, **kwargs) 2022-11-23T03:12:19.1579372Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1579487Z self.run_subtests( 2022-11-23T03:12:19.1579832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1579985Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1580327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1580469Z fsdp_loss = self._train_for_several_steps( 
2022-11-23T03:12:19.1580829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1580936Z output = model(*input) 2022-11-23T03:12:19.1581052Z File "", line 1, in 2022-11-23T03:12:19.1581363Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1581540Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1581903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1582069Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1582269Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1582401Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1582755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1582863Z _lazy_init(state, module) 2022-11-23T03:12:19.1583052Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1583191Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1583520Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1583653Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1583773Z File "", line 1, in 2022-11-23T03:12:19.1584198Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1584303Z self.run() 2022-11-23T03:12:19.1584637Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1584750Z return func(*args, **kwargs) 2022-11-23T03:12:19.1584947Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1585070Z 
exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1585264Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1585398Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1585761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1585857Z p_assert( 2022-11-23T03:12:19.1586050Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1586189Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1586512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1586627Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1586947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1587061Z traceback.print_stack() 2022-11-23T03:12:19.1587259Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1587350Z self.run() 2022-11-23T03:12:19.1587698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1587810Z getattr(self, test_name)() 2022-11-23T03:12:19.1587997Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1588218Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1588741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1588825Z fn() 2022-11-23T03:12:19.1589191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1589310Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1589651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in 
instantiated_test 2022-11-23T03:12:19.1589759Z test(self, **param_kwargs) 2022-11-23T03:12:19.1590084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1590191Z getattr(self, test_name)() 2022-11-23T03:12:19.1590525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1590888Z return func(*args, **kwargs) 2022-11-23T03:12:19.1591237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1591324Z fn() 2022-11-23T03:12:19.1591607Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1591711Z self.run_subtests( 2022-11-23T03:12:19.1592055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1592168Z test(self, **param_kwargs) 2022-11-23T03:12:19.1592507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1592658Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1593009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1593123Z return func(*args, **kwargs) 2022-11-23T03:12:19.1593468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1593607Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1594028Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1594128Z self.run_subtests( 2022-11-23T03:12:19.1594479Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1594582Z output = model(*input) 2022-11-23T03:12:19.1595098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1595252Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1595571Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1595702Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1596043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1596184Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1596547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1596710Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1597070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1597179Z output = model(*input) 2022-11-23T03:12:19.1597533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1597694Z _lazy_init(state, module) 2022-11-23T03:12:19.1598160Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1598286Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1598614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1598742Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1599089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", 
line 685, in forward 2022-11-23T03:12:19.1599247Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1599741Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1599855Z return func(*args, **kwargs) 2022-11-23T03:12:19.1600280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1600382Z _lazy_init(state, module) 2022-11-23T03:12:19.1600746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1600839Z p_assert( 2022-11-23T03:12:19.1601173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1601304Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1601628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1601742Z traceback.print_stack() 2022-11-23T03:12:19.1602057Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1602170Z return func(*args, **kwargs) 2022-11-23T03:12:19.1602538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1602633Z p_assert( 2022-11-23T03:12:19.1602955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1603069Z traceback.print_stack() 2022-11-23T03:12:19.1603344Z File "", line 1, in 2022-11-23T03:12:19.1603535Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1603654Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1603837Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:12:19.1603973Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1604166Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1604255Z self.run() 2022-11-23T03:12:19.1604440Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1604573Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1604889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1605002Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1605337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1605445Z getattr(self, test_name)() 2022-11-23T03:12:19.1605776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1605858Z fn() 2022-11-23T03:12:19.1606201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1606308Z test(self, **param_kwargs) 2022-11-23T03:12:19.1606635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1606747Z return func(*args, **kwargs) 2022-11-23T03:12:19.1607248Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1607357Z self.run_subtests( 2022-11-23T03:12:19.1607698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1607847Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1608195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T03:12:19.1608335Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1608696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1608797Z output = model(*input) 2022-11-23T03:12:19.1609160Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1609292Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1609653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1609814Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1610166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1610274Z _lazy_init(state, module) 2022-11-23T03:12:19.1610612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1610736Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1611218Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1611326Z return func(*args, **kwargs) 2022-11-23T03:12:19.1611685Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1611772Z p_assert( 2022-11-23T03:12:19.1612083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1612193Z traceback.print_stack() 2022-11-23T03:12:19.1612300Z File "", line 1, in 2022-11-23T03:12:19.1612412Z File "", line 1, in 2022-11-23T03:12:19.1612781Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1612912Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:19.1613102Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1613240Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1613436Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1613568Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1613767Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1613859Z self.run() 2022-11-23T03:12:19.1614048Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1614184Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1614372Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1614506Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1614705Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1614789Z self.run() 2022-11-23T03:12:19.1615123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1615245Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1615435Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1615774Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1616120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1616228Z getattr(self, test_name)() 2022-11-23T03:12:19.1616541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1616650Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1616985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1617069Z fn() 2022-11-23T03:12:19.1617401Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1617507Z getattr(self, test_name)() 2022-11-23T03:12:19.1617847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1618002Z test(self, **param_kwargs) 2022-11-23T03:12:19.1618338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1618414Z fn() 2022-11-23T03:12:19.1618746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1618856Z return func(*args, **kwargs) 2022-11-23T03:12:19.1619190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1619296Z test(self, **param_kwargs) 2022-11-23T03:12:19.1619569Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1619672Z self.run_subtests( 2022-11-23T03:12:19.1620003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1620111Z return func(*args, **kwargs) 2022-11-23T03:12:19.1620438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1620583Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1620852Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1620950Z self.run_subtests( 2022-11-23T03:12:19.1621288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T03:12:19.1621424Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1621749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1621885Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1622239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1622346Z output = model(*input) 2022-11-23T03:12:19.1622860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1623002Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1623312Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1623439Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1623801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1624309Z output = model(*input) 2022-11-23T03:12:19.1624694Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1624864Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1625246Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1625381Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1625740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1625849Z _lazy_init(state, module) 2022-11-23T03:12:19.1626210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1626373Z args, kwargs = _root_pre_forward(self, self, *args, 
**kwargs) 2022-11-23T03:12:19.1626704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1626834Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1627181Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1627353Z _lazy_init(state, module) 2022-11-23T03:12:19.1627681Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1627793Z return func(*args, **kwargs) 2022-11-23T03:12:19.1628125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1628254Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1628613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1628705Z p_assert( 2022-11-23T03:12:19.1629028Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1629139Z return func(*args, **kwargs) 2022-11-23T03:12:19.1629615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1629733Z traceback.print_stack() 2022-11-23T03:12:19.1630089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1630170Z p_assert( 2022-11-23T03:12:19.1630481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1630589Z traceback.print_stack() 2022-11-23T03:12:19.1630701Z File "", line 1, in 2022-11-23T03:12:19.1630888Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1631013Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:19.1631380Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1631519Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1631712Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1631809Z self.run() 2022-11-23T03:12:19.1632058Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1632197Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1632524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1632646Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1632995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1633107Z getattr(self, test_name)() 2022-11-23T03:12:19.1633449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1633535Z fn() 2022-11-23T03:12:19.1633888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1634003Z test(self, **param_kwargs) 2022-11-23T03:12:19.1634555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1634672Z return func(*args, **kwargs) 2022-11-23T03:12:19.1635124Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1635225Z self.run_subtests( 2022-11-23T03:12:19.1635558Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1635712Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1636065Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1636206Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1636568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1636726Z output = model(*input) 2022-11-23T03:12:19.1637046Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1637174Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1637530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1637849Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1638193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1638298Z _lazy_init(state, module) 2022-11-23T03:12:19.1638802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1638934Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1639261Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1639377Z return func(*args, **kwargs) 2022-11-23T03:12:19.1639735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1639826Z p_assert( 2022-11-23T03:12:19.1640149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1640264Z traceback.print_stack() 2022-11-23T03:12:19.1640382Z File "", line 1, in 2022-11-23T03:12:19.1640580Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 
2022-11-23T03:12:19.1640712Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1640897Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1641035Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1641239Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1641334Z self.run() 2022-11-23T03:12:19.1641523Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1641656Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1641983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1642104Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1642448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1642562Z getattr(self, test_name)() 2022-11-23T03:12:19.1642909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1642995Z fn() 2022-11-23T03:12:19.1643344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1643459Z test(self, **param_kwargs) 2022-11-23T03:12:19.1643847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1643966Z return func(*args, **kwargs) 2022-11-23T03:12:19.1644248Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1644350Z self.run_subtests( 2022-11-23T03:12:19.1644693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1644842Z test_fn(*test_args, **test_kwargs, 
**subtest_kwargs) 2022-11-23T03:12:19.1645192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1645334Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1645699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1645856Z output = model(*input) 2022-11-23T03:12:19.1646165Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1646294Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1646656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1646821Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1647175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1647284Z _lazy_init(state, module) 2022-11-23T03:12:19.1647620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1647750Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1648075Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1648189Z return func(*args, **kwargs) 2022-11-23T03:12:19.1648552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1648643Z p_assert( 2022-11-23T03:12:19.1648964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1649078Z traceback.print_stack() 2022-11-23T03:12:19.1649195Z File "", line 1, in 2022-11-23T03:12:19.1649392Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", 
line 116, in spawn_main 2022-11-23T03:12:19.1649515Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1649703Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1649841Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1650053Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1650147Z self.run() 2022-11-23T03:12:19.1650337Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1650473Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1650794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1650921Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1651269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1651384Z getattr(self, test_name)() 2022-11-23T03:12:19.1651728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1651817Z fn() 2022-11-23T03:12:19.1652165Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1652326Z test(self, **param_kwargs) 2022-11-23T03:12:19.1652678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1652792Z return func(*args, **kwargs) 2022-11-23T03:12:19.1653238Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1653337Z self.run_subtests( 2022-11-23T03:12:19.1653663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1653807Z test_fn(*test_args, 
**test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1654148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1654285Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1654864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1654975Z output = model(*input) 2022-11-23T03:12:19.1655290Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1655418Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1655783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1655948Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1656300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1656409Z _lazy_init(state, module) 2022-11-23T03:12:19.1656747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1656875Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1657201Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1657313Z return func(*args, **kwargs) 2022-11-23T03:12:19.1657678Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1657772Z p_assert( 2022-11-23T03:12:19.1657890Z File "", line 1, in 2022-11-23T03:12:19.1658210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1658318Z traceback.print_stack() 2022-11-23T03:12:19.1658517Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1658648Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1658838Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1658980Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1659182Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1659276Z self.run() 2022-11-23T03:12:19.1659465Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1659591Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1659915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1660035Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1660381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1660492Z getattr(self, test_name)() 2022-11-23T03:12:19.1660838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1660925Z fn() 2022-11-23T03:12:19.1661039Z File "", line 1, in 2022-11-23T03:12:19.1661441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1661559Z test(self, **param_kwargs) 2022-11-23T03:12:19.1661901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1662013Z return func(*args, **kwargs) 2022-11-23T03:12:19.1662208Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1662338Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1662619Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1662715Z self.run_subtests( 2022-11-23T03:12:19.1663065Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1663261Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1663593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1663739Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1664128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1664228Z self.run() 2022-11-23T03:12:19.1664575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1664703Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1664888Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1665019Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1665368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1665476Z output = model(*input) 2022-11-23T03:12:19.1665794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1666003Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1666310Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1666429Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1666764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1666871Z getattr(self, test_name)() 2022-11-23T03:12:19.1667219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T03:12:19.1667376Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1667707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1667796Z fn() 2022-11-23T03:12:19.1668139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1668239Z _lazy_init(state, module) 2022-11-23T03:12:19.1668576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1668683Z test(self, **param_kwargs) 2022-11-23T03:12:19.1669004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1669130Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1669460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1669569Z return func(*args, **kwargs) 2022-11-23T03:12:19.1670062Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1670174Z return func(*args, **kwargs) 2022-11-23T03:12:19.1670531Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1670644Z self.run_subtests( 2022-11-23T03:12:19.1671012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1671102Z p_assert( 2022-11-23T03:12:19.1671441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1671593Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1671915Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1672021Z traceback.print_stack() 2022-11-23T03:12:19.1672371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1672579Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1673100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1673205Z output = model(*input) 2022-11-23T03:12:19.1673504Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1673629Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1673976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1674127Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1674655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1674765Z _lazy_init(state, module) 2022-11-23T03:12:19.1675098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1675236Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1675561Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1675674Z return func(*args, **kwargs) 2022-11-23T03:12:19.1676038Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1676122Z p_assert( 2022-11-23T03:12:19.1676444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1676559Z traceback.print_stack() 
2022-11-23T03:12:19.1676677Z File "", line 1, in 2022-11-23T03:12:19.1676876Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1677005Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1677200Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1677343Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1677536Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1677627Z self.run() 2022-11-23T03:12:19.1677817Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1677952Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1678280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1678401Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1678750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1678856Z getattr(self, test_name)() 2022-11-23T03:12:19.1679200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1679290Z fn() 2022-11-23T03:12:19.1679849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1679967Z test(self, **param_kwargs) 2022-11-23T03:12:19.1680300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1680409Z return func(*args, **kwargs) 2022-11-23T03:12:19.1680683Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1680775Z self.run_subtests( 2022-11-23T03:12:19.1681101Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1681246Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1681763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1681955Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1682318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1682427Z output = model(*input) 2022-11-23T03:12:19.1682739Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1682860Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1683224Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1683388Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1683739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1683847Z _lazy_init(state, module) 2022-11-23T03:12:19.1684194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1684324Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1684805Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1684916Z return func(*args, **kwargs) 2022-11-23T03:12:19.1685262Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1685350Z p_assert( 2022-11-23T03:12:19.1685663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.1685774Z traceback.print_stack() 2022-11-23T03:12:19.1685886Z File "", line 1, in 2022-11-23T03:12:19.1686076Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1686205Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1686384Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1686517Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1686708Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1686798Z self.run() 2022-11-23T03:12:19.1686980Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1687109Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1687421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1687537Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1687867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1687974Z getattr(self, test_name)() 2022-11-23T03:12:19.1688303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1688435Z fn() 2022-11-23T03:12:19.1688784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1688891Z test(self, **param_kwargs) 2022-11-23T03:12:19.1689276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1689385Z return func(*args, **kwargs) 2022-11-23T03:12:19.1689652Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1689752Z self.run_subtests( 
2022-11-23T03:12:19.1690075Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1690221Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1690611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1690749Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1691282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1691390Z output = model(*input) 2022-11-23T03:12:19.1691696Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1691823Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1692182Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1692345Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1692698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1692812Z _lazy_init(state, module) 2022-11-23T03:12:19.1693154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1693287Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1693604Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1693718Z return func(*args, **kwargs) 2022-11-23T03:12:19.1694243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1694332Z p_assert( 2022-11-23T03:12:19.1694643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T03:12:19.1694758Z traceback.print_stack() 2022-11-23T03:12:19.1694872Z File "", line 1, in 2022-11-23T03:12:19.1695239Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1695368Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1695562Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1695700Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1695899Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1695989Z self.run() 2022-11-23T03:12:19.1696180Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1696313Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1696633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1696753Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1697102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1697214Z getattr(self, test_name)() 2022-11-23T03:12:19.1697613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1697706Z fn() 2022-11-23T03:12:19.1698060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1698327Z test(self, **param_kwargs) 2022-11-23T03:12:19.1698654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1698764Z return func(*args, **kwargs) 2022-11-23T03:12:19.1699036Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1699133Z 
self.run_subtests( 2022-11-23T03:12:19.1699459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1699606Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1700179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1700320Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1700675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1700783Z output = model(*input) 2022-11-23T03:12:19.1701099Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1701227Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1701588Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1701750Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1702101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1702217Z _lazy_init(state, module) 2022-11-23T03:12:19.1702548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1702679Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1703002Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1703115Z return func(*args, **kwargs) 2022-11-23T03:12:19.1703479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1703570Z p_assert( 2022-11-23T03:12:19.1704098Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1704227Z traceback.print_stack() 2022-11-23T03:12:19.1704338Z File "", line 1, in 2022-11-23T03:12:19.1704533Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1704674Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1704867Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1705007Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1705206Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1705297Z self.run() 2022-11-23T03:12:19.1705480Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1705614Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1705949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1706069Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1706415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1706531Z getattr(self, test_name)() 2022-11-23T03:12:19.1706945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1707039Z fn() 2022-11-23T03:12:19.1707389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1707500Z test(self, **param_kwargs) 2022-11-23T03:12:19.1707841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1707955Z return func(*args, **kwargs) 2022-11-23T03:12:19.1708240Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1708343Z self.run_subtests( 2022-11-23T03:12:19.1708683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1708893Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1709237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1709381Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1709739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1709849Z output = model(*input) 2022-11-23T03:12:19.1710161Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1710289Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1710649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1710812Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1711164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1711273Z _lazy_init(state, module) 2022-11-23T03:12:19.1711612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1711744Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1712066Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1712178Z return func(*args, **kwargs) 2022-11-23T03:12:19.1712703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1712790Z p_assert( 
2022-11-23T03:12:19.1713285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1713394Z traceback.print_stack() 2022-11-23T03:12:19.1713510Z File "", line 1, in 2022-11-23T03:12:19.1713714Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1713843Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1714031Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1714170Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1714370Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1714457Z self.run() 2022-11-23T03:12:19.1714646Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1714779Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1715104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1715226Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1715573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1715687Z getattr(self, test_name)() 2022-11-23T03:12:19.1716075Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1716161Z fn() 2022-11-23T03:12:19.1716518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1716628Z test(self, **param_kwargs) 2022-11-23T03:12:19.1716969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1717084Z return func(*args, **kwargs) 2022-11-23T03:12:19.1717365Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1717467Z self.run_subtests( 2022-11-23T03:12:19.1717805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1718033Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1718388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1718529Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1719052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1719156Z output = model(*input) 2022-11-23T03:12:19.1719461Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1719584Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1719932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1720084Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1720434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1720540Z _lazy_init(state, module) 2022-11-23T03:12:19.1720868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1720994Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1721306Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1721416Z return func(*args, **kwargs) 2022-11-23T03:12:19.1721769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T03:12:19.1721851Z p_assert( 2022-11-23T03:12:19.1722162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1722272Z traceback.print_stack() 2022-11-23T03:12:19.1722388Z File "", line 1, in 2022-11-23T03:12:19.1722581Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1722708Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1722894Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1723021Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1723214Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1723303Z self.run() 2022-11-23T03:12:19.1723487Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1723615Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1723931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1724051Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1724386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1724535Z getattr(self, test_name)() 2022-11-23T03:12:19.1724655Z File "", line 1, in 2022-11-23T03:12:19.1724991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1725074Z fn() 2022-11-23T03:12:19.1725599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1725710Z test(self, **param_kwargs) 2022-11-23T03:12:19.1725907Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1726038Z exitcode = _main(fd, parent_sentinel) 
2022-11-23T03:12:19.1726378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1726491Z return func(*args, **kwargs) 2022-11-23T03:12:19.1726739Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1726882Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1727170Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1727272Z self.run_subtests( 2022-11-23T03:12:19.1727472Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1727564Z self.run() 2022-11-23T03:12:19.1727897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1728046Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1728237Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1728368Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1728717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1728866Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1729189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1729303Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1729825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1729928Z output = model(*input) 2022-11-23T03:12:19.1730263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1730371Z getattr(self, test_name)() 
2022-11-23T03:12:19.1730671Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1730797Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1731134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1731212Z fn() 2022-11-23T03:12:19.1731746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1731923Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1732312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1732424Z test(self, **param_kwargs) 2022-11-23T03:12:19.1732774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1732883Z _lazy_init(state, module) 2022-11-23T03:12:19.1733228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1733333Z return func(*args, **kwargs) 2022-11-23T03:12:19.1733717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1733856Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1734138Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1734240Z self.run_subtests( 2022-11-23T03:12:19.1734568Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1734841Z return func(*args, **kwargs) 2022-11-23T03:12:19.1735348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 
2022-11-23T03:12:19.1735499Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1735859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1735998Z p_assert( 2022-11-23T03:12:19.1736351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1736493Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1736816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1736930Z traceback.print_stack() 2022-11-23T03:12:19.1737292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1737398Z output = model(*input) 2022-11-23T03:12:19.1737700Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1737829Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1738342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1738504Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1738849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1739134Z _lazy_init(state, module) 2022-11-23T03:12:19.1739468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1739598Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1739915Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1740028Z return func(*args, **kwargs) 2022-11-23T03:12:19.1740393Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1740483Z p_assert( 2022-11-23T03:12:19.1740804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1740923Z traceback.print_stack() 2022-11-23T03:12:19.1741043Z File "", line 1, in 2022-11-23T03:12:19.1741232Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1741361Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1741550Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1741689Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1741887Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1741978Z self.run() 2022-11-23T03:12:19.1742168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1742301Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1742620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1742745Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1743143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1743321Z getattr(self, test_name)() 2022-11-23T03:12:19.1743688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1743774Z fn() 2022-11-23T03:12:19.1744505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1744613Z test(self, **param_kwargs) 2022-11-23T03:12:19.1744942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:19.1745052Z return func(*args, **kwargs) 2022-11-23T03:12:19.1745325Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1745503Z self.run_subtests( 2022-11-23T03:12:19.1745832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1745977Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1746480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1746622Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1746975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1747083Z output = model(*input) 2022-11-23T03:12:19.1747396Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1747525Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1747883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1748052Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1748404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1748513Z _lazy_init(state, module) 2022-11-23T03:12:19.1748844Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1748975Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1749297Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1749409Z return func(*args, **kwargs) 
2022-11-23T03:12:19.1749773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1749864Z p_assert( 2022-11-23T03:12:19.1750206Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1750328Z traceback.print_stack() 2022-11-23T03:12:19.1750443Z File "", line 1, in 2022-11-23T03:12:19.1750639Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1750769Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1750958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1751096Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1751296Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1751388Z self.run() 2022-11-23T03:12:19.1751570Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1751705Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1752031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1752156Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1752560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1752679Z getattr(self, test_name)() 2022-11-23T03:12:19.1753028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1753114Z fn() 2022-11-23T03:12:19.1753457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1753569Z test(self, **param_kwargs) 2022-11-23T03:12:19.1753912Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1754024Z return func(*args, **kwargs) 2022-11-23T03:12:19.1754308Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1754461Z self.run_subtests( 2022-11-23T03:12:19.1754803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1754954Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1755458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1755595Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1755946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1756048Z output = model(*input) 2022-11-23T03:12:19.1756347Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1756472Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1756823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1756984Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1757318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1757422Z _lazy_init(state, module) 2022-11-23T03:12:19.1757747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1757872Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1758186Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T03:12:19.1758299Z return func(*args, **kwargs) 2022-11-23T03:12:19.1758831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1758922Z p_assert( 2022-11-23T03:12:19.1759246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1759360Z traceback.print_stack() 2022-11-23T03:12:19.1759476Z File "", line 1, in 2022-11-23T03:12:19.1759672Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1759803Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1759990Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1760129Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1760328Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1760416Z self.run() 2022-11-23T03:12:19.1760606Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1760740Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1761067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1761239Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1761595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1761705Z getattr(self, test_name)() 2022-11-23T03:12:19.1762196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1762281Z fn() 2022-11-23T03:12:19.1762618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1762724Z test(self, **param_kwargs) 
2022-11-23T03:12:19.1763055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1763165Z return func(*args, **kwargs) 2022-11-23T03:12:19.1763443Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1763605Z self.run_subtests( 2022-11-23T03:12:19.1763927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1764076Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1764414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1764550Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1764900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1765004Z output = model(*input) 2022-11-23T03:12:19.1765306Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1765430Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1765789Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1765943Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1766280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1766386Z _lazy_init(state, module) 2022-11-23T03:12:19.1766710Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1766835Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1767146Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1767255Z return func(*args, **kwargs) 2022-11-23T03:12:19.1767604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1767691Z p_assert( 2022-11-23T03:12:19.1768005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1768118Z traceback.print_stack() 2022-11-23T03:12:19.1768231Z File "", line 1, in 2022-11-23T03:12:19.1768420Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1768544Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1768725Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1768853Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1769046Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1769135Z self.run() 2022-11-23T03:12:19.1769318Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1769447Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1769811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1769935Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1770448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1770560Z getattr(self, test_name)() 2022-11-23T03:12:19.1770903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1770982Z fn() 2022-11-23T03:12:19.1771333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.1771443Z test(self, **param_kwargs) 2022-11-23T03:12:19.1771784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1771896Z return func(*args, **kwargs) 2022-11-23T03:12:19.1772229Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1772331Z self.run_subtests( 2022-11-23T03:12:19.1772670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1772812Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1773162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1773461Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1773577Z File "", line 1, in 2022-11-23T03:12:19.1773923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1774026Z output = model(*input) 2022-11-23T03:12:19.1774216Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1774349Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1774828Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1774958Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1775150Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1775288Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1775651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1775817Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.1776016Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1776102Z self.run() 2022-11-23T03:12:19.1776456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1776569Z _lazy_init(state, module) 2022-11-23T03:12:19.1776763Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1776898Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1777240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1777370Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1777693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1777808Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1778133Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1778246Z return func(*args, **kwargs) 2022-11-23T03:12:19.1778590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1778706Z getattr(self, test_name)() 2022-11-23T03:12:19.1779112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1779208Z p_assert( 2022-11-23T03:12:19.1779551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1779631Z fn() 2022-11-23T03:12:19.1779953Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1780066Z traceback.print_stack() 2022-11-23T03:12:19.1780415Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1780525Z test(self, **param_kwargs) 2022-11-23T03:12:19.1780862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1781024Z return func(*args, **kwargs) 2022-11-23T03:12:19.1781315Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1781411Z self.run_subtests( 2022-11-23T03:12:19.1781749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1781896Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1782245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1782384Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1782895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1782997Z output = model(*input) 2022-11-23T03:12:19.1783297Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1783423Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1784162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1784337Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1784693Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1784802Z _lazy_init(state, module) 2022-11-23T03:12:19.1785137Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1785267Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1785590Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1785696Z return func(*args, **kwargs) 2022-11-23T03:12:19.1786065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1786160Z p_assert( 2022-11-23T03:12:19.1786644Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1786756Z traceback.print_stack() 2022-11-23T03:12:19.1786871Z File "", line 1, in 2022-11-23T03:12:19.1787060Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1787185Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1787363Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1787497Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1787692Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1787781Z self.run() 2022-11-23T03:12:19.1787965Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1788100Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1788480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1788599Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1788982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1789092Z getattr(self, test_name)() 2022-11-23T03:12:19.1789426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T03:12:19.1789510Z fn() 2022-11-23T03:12:19.1790026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1790137Z test(self, **param_kwargs) 2022-11-23T03:12:19.1790478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1790666Z return func(*args, **kwargs) 2022-11-23T03:12:19.1790955Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1791057Z self.run_subtests( 2022-11-23T03:12:19.1791398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1791547Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1791896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1792037Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1792399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1792505Z output = model(*input) 2022-11-23T03:12:19.1792814Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1792952Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1793316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1793480Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1793831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1793940Z _lazy_init(state, module) 
2022-11-23T03:12:19.1794055Z File "", line 1, in 2022-11-23T03:12:19.1794391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1794673Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1794984Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1795096Z return func(*args, **kwargs) 2022-11-23T03:12:19.1795290Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1795416Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1795957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1796048Z p_assert( 2022-11-23T03:12:19.1796233Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1796372Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1796692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1796805Z traceback.print_stack() 2022-11-23T03:12:19.1797006Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1797096Z self.run() 2022-11-23T03:12:19.1797289Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1797472Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1797803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1797927Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1798275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1798385Z getattr(self, test_name)() 2022-11-23T03:12:19.1798732Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1798817Z fn() 2022-11-23T03:12:19.1799323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1799435Z test(self, **param_kwargs) 2022-11-23T03:12:19.1799758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1799919Z return func(*args, **kwargs) 2022-11-23T03:12:19.1800194Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1800472Z self.run_subtests( 2022-11-23T03:12:19.1800812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1800964Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1801315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1801456Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1801812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1801920Z output = model(*input) 2022-11-23T03:12:19.1802243Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1802372Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1802733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1802897Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1803246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:19.1803355Z _lazy_init(state, module) 2022-11-23T03:12:19.1803687Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1803974Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1804289Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1804400Z return func(*args, **kwargs) 2022-11-23T03:12:19.1804934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1805027Z p_assert( 2022-11-23T03:12:19.1805352Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1805469Z traceback.print_stack() 2022-11-23T03:12:19.1805582Z File "", line 1, in 2022-11-23T03:12:19.1805782Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1805911Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1806101Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1806238Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1806438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1806533Z self.run() 2022-11-23T03:12:19.1806767Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1806908Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1807234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1807355Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1807701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1807813Z getattr(self, test_name)() 
2022-11-23T03:12:19.1808154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1808241Z fn() 2022-11-23T03:12:19.1808582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1808692Z test(self, **param_kwargs) 2022-11-23T03:12:19.1809087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1809198Z return func(*args, **kwargs) 2022-11-23T03:12:19.1809480Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1809582Z self.run_subtests( 2022-11-23T03:12:19.1809919Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1810070Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1810417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1810558Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1810921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1811033Z output = model(*input) 2022-11-23T03:12:19.1811511Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1811636Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1811988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1812146Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1812478Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1812584Z _lazy_init(state, module) 2022-11-23T03:12:19.1812909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1813034Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1813521Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1813641Z return func(*args, **kwargs) 2022-11-23T03:12:19.1814003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1814093Z p_assert( 2022-11-23T03:12:19.1814409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1814522Z traceback.print_stack() 2022-11-23T03:12:19.1814639Z File "", line 1, in 2022-11-23T03:12:19.1814834Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1814964Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1815153Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1815291Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1815491Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1815581Z self.run() 2022-11-23T03:12:19.1815820Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1815959Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1816448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1816565Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1816900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T03:12:19.1817007Z getattr(self, test_name)() 2022-11-23T03:12:19.1817341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1817418Z fn() 2022-11-23T03:12:19.1817752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1817906Z test(self, **param_kwargs) 2022-11-23T03:12:19.1818240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1818349Z return func(*args, **kwargs) 2022-11-23T03:12:19.1818622Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1818720Z self.run_subtests( 2022-11-23T03:12:19.1819039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1819183Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1819520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1819655Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1820003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1820113Z output = model(*input) 2022-11-23T03:12:19.1820415Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1820540Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1820888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1821041Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.1821383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1821488Z _lazy_init(state, module) 2022-11-23T03:12:19.1821813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1821938Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1822255Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1822368Z return func(*args, **kwargs) 2022-11-23T03:12:19.1822724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1822806Z p_assert( 2022-11-23T03:12:19.1823118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1823229Z traceback.print_stack() 2022-11-23T03:12:19.1823341Z File "", line 1, in 2022-11-23T03:12:19.1823529Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1823654Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1823836Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1824178Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1824378Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1824533Z self.run() 2022-11-23T03:12:19.1824725Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1824854Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1825177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1825293Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1825626Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1825728Z getattr(self, test_name)() 2022-11-23T03:12:19.1826060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1826145Z fn() 2022-11-23T03:12:19.1826482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1826654Z test(self, **param_kwargs) 2022-11-23T03:12:19.1826989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1827098Z return func(*args, **kwargs) 2022-11-23T03:12:19.1827552Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1827649Z self.run_subtests( 2022-11-23T03:12:19.1827989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1828139Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1828489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1828630Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1828996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1829109Z output = model(*input) 2022-11-23T03:12:19.1829421Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1829544Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1829906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T03:12:19.1830228Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1830571Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1830676Z _lazy_init(state, module) 2022-11-23T03:12:19.1831002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1831130Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1831451Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1831554Z return func(*args, **kwargs) 2022-11-23T03:12:19.1832133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1832226Z p_assert( 2022-11-23T03:12:19.1832550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1832663Z traceback.print_stack() 2022-11-23T03:12:19.1832779Z File "", line 1, in 2022-11-23T03:12:19.1832974Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1833104Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1833287Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1833425Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1833677Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1833777Z self.run() 2022-11-23T03:12:19.1833965Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1834097Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1834422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1834537Z self.run_test(test_name, 
pipe) 2022-11-23T03:12:19.1834883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1834994Z getattr(self, test_name)() 2022-11-23T03:12:19.1835659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1835746Z fn() 2022-11-23T03:12:19.1836096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1836258Z test(self, **param_kwargs) 2022-11-23T03:12:19.1836601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1836705Z return func(*args, **kwargs) 2022-11-23T03:12:19.1836987Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1837088Z self.run_subtests( 2022-11-23T03:12:19.1837426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1837576Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1837928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1838069Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1838598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1838696Z output = model(*input) 2022-11-23T03:12:19.1839000Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1839123Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1839654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T03:12:19.1839818Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1840166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1840274Z _lazy_init(state, module) 2022-11-23T03:12:19.1840611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1840740Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1841068Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1841183Z return func(*args, **kwargs) 2022-11-23T03:12:19.1841546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1841638Z p_assert( 2022-11-23T03:12:19.1841961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1842074Z traceback.print_stack() 2022-11-23T03:12:19.1842190Z File "", line 1, in 2022-11-23T03:12:19.1842382Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1842512Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1842700Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1842843Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1843095Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1843195Z self.run() 2022-11-23T03:12:19.1843311Z File "", line 1, in 2022-11-23T03:12:19.1843650Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1843780Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1844097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, 
in _run 2022-11-23T03:12:19.1844213Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1844402Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1844526Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1844857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1845010Z getattr(self, test_name)() 2022-11-23T03:12:19.1845190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1845324Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1845846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1845933Z fn() 2022-11-23T03:12:19.1846132Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1846224Z self.run() 2022-11-23T03:12:19.1846575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1846686Z test(self, **param_kwargs) 2022-11-23T03:12:19.1846870Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1847005Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1847350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1847472Z return func(*args, **kwargs) 2022-11-23T03:12:19.1847795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1847918Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1848199Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1848301Z self.run_subtests( 2022-11-23T03:12:19.1848641Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1848751Z getattr(self, test_name)() 2022-11-23T03:12:19.1849090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1849240Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1849589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.1849677Z fn() 2022-11-23T03:12:19.1850025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1850165Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1850506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.1850619Z test(self, **param_kwargs) 2022-11-23T03:12:19.1850981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1851088Z output = model(*input) 2022-11-23T03:12:19.1851432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.1851545Z return func(*args, **kwargs) 2022-11-23T03:12:19.1851909Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1852047Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1852324Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T03:12:19.1852426Z self.run_subtests( 2022-11-23T03:12:19.1852791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T03:12:19.1852955Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.1853290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.1853438Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.1853949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1854112Z _lazy_init(state, module) 2022-11-23T03:12:19.1854449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.1854583Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.1854906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1855033Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1855381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.1855484Z output = model(*input) 2022-11-23T03:12:19.1855796Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1855905Z return func(*args, **kwargs) 2022-11-23T03:12:19.1856199Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.1856328Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.1856681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1856769Z p_assert( 2022-11-23T03:12:19.1857115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.1857271Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T03:12:19.1857586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1857697Z traceback.print_stack() 2022-11-23T03:12:19.1858212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.1858323Z _lazy_init(state, module) 2022-11-23T03:12:19.1858662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.1858801Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.1859127Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.1859240Z return func(*args, **kwargs) 2022-11-23T03:12:19.1859604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.1859695Z p_assert( 2022-11-23T03:12:19.1860009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.1860123Z traceback.print_stack() 2022-11-23T03:12:19.1860223Z dist init r=3, world=4 2022-11-23T03:12:19.1860542Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1860928Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1861236Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1861528Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:19.1861818Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1862108Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1862558Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1862877Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1863156Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1863433Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1863711Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1864189Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.1864298Z dist init r=0, world=4 2022-11-23T03:12:19.1864601Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1864894Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.1865180Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1865462Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1865740Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1866020Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1866300Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1866578Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1866855Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1867133Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1867484Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.1867772Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.1867866Z dist init r=1, world=4 2022-11-23T03:12:19.1868166Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1868457Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1868740Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1869207Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1869554Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1869845Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1870134Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1870423Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1870711Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1871006Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:19.1871294Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1871582Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.1871681Z dist init r=2, world=4 2022-11-23T03:12:19.1871992Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1872292Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1872585Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1872877Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1873168Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1873456Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1873744Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1874077Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:19.1874372Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1874658Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1875272Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1875563Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.1875653Z ok (6.222s) 2022-11-23T03:12:19.1876019Z test_transformer_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27978 2022-11-23T03:12:19.1876229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27979 2022-11-23T03:12:19.1876432Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27980 2022-11-23T03:12:19.1876636Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27981 2022-11-23T03:12:19.1877010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1877174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1877541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1877720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1878082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1878240Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1878603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1878784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1879134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1879295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1879656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1879831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1880179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1880348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1880708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1880885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1881118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.1881348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.1881576Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.1881962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.1882333Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.1882753Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1883120Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1883480Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1883689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.1883898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.1884104Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.1884309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.1884573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1884790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1884999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1885204Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1886380Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.1886482Z warnings.warn( 2022-11-23T03:12:19.1887482Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1887585Z warnings.warn( 2022-11-23T03:12:19.1888565Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1888662Z warnings.warn( 2022-11-23T03:12:19.1889847Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1889941Z warnings.warn( 2022-11-23T03:12:19.1890155Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1890367Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1890578Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1890790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1891006Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1891255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1891472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1891678Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1892065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1892279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1892493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1892707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1892921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1893176Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1893392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1893603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1893814Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1894023Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1894234Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1894446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1894657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1895017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1895229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1895433Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1895638Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1895840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1896229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1896441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1896653Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1896861Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1897068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1897286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1897498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1897708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1897920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1898129Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1898341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1898550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1898754Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1899124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1899373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1899582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1899790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1899994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1900199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1900399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1900595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1900979Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1901238Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1901449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1901660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1901871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1902082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1902293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1902505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1902709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1902925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1903141Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1903353Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1903561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1903773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1904335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1904549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1904745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1904950Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1905152Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1905364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1905566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1905770Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1905972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1906175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1906372Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1906577Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1906783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1906989Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1907256Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1907471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1907672Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1908055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1908266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1908471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1908682Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1908893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1909184Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1909397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1909606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1909818Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1910028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1910236Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1910446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1910658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1910868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1910973Z dist init r=1, world=4 2022-11-23T03:12:19.1911077Z dist init r=3, world=4 2022-11-23T03:12:19.1911175Z dist init r=0, world=4 2022-11-23T03:12:19.1911272Z dist init r=2, world=4 2022-11-23T03:12:19.1911354Z ok (9.429s) 2022-11-23T03:12:19.1911681Z test_transformer_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28279 2022-11-23T03:12:19.1912047Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28280 2022-11-23T03:12:19.1912245Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28281 2022-11-23T03:12:19.1912438Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28282 2022-11-23T03:12:19.1912797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1912955Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1913314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1913481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1914002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1914168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1914531Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1914709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1915061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1915222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1915634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1915809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1916165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1916327Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1916853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1917024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1917249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.1917472Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.1917692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.1917961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.1918329Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.1918700Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1919061Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1919420Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1919631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.1919840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.1920052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.1920256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.1920471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1920679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1920894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1921104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1922078Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.1922180Z warnings.warn( 2022-11-23T03:12:19.1923324Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1923423Z warnings.warn( 2022-11-23T03:12:19.1924458Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1924565Z warnings.warn( 2022-11-23T03:12:19.1925555Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1925652Z warnings.warn( 2022-11-23T03:12:19.1925875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1926243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1926505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1926708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1926921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1927128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1927335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1927721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1927936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1928147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1928364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1928573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1928785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1928996Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1929207Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1929418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1929631Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1929840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1930051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1930262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1930636Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1930839Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1931042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1931246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1931450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1931653Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1931857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1932368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1932661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1932879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1933091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1933302Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1933513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1933722Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1933936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1934146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1934350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1934611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1934824Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1935034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1935244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1935615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1935992Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1936203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1936408Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1936621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1936834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1937046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1937257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1937466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1937676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1937886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1938096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1938302Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1938520Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1938893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1939098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1939300Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1939503Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1939879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1940091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1940296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1940507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1940766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1940983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1941193Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1941404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1941614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1941825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1942027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1942239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1942448Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1942714Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1942923Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1943134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1943344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1943553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1943765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1944179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1944399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1944616Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1944831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1945042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1945252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1945464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1945674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1945878Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1946089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1946300Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1946517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1946728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1946937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1947039Z dist init r=0, world=4 2022-11-23T03:12:19.1947137Z dist init r=1, world=4 2022-11-23T03:12:19.1947227Z dist init r=3, world=4 2022-11-23T03:12:19.1947323Z dist init r=2, world=4 2022-11-23T03:12:19.1947411Z ok (9.529s) 2022-11-23T03:12:19.1947746Z test_transformer_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28580 2022-11-23T03:12:19.1947954Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28581 2022-11-23T03:12:19.1948157Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28582 2022-11-23T03:12:19.1948428Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28583 2022-11-23T03:12:19.1948815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1948973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1949338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1949517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1949868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1950031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1950392Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1950633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1950994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1951153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1951515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1951691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1952041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1952204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1952565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1952741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1952981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.1953213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.1953433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.1953661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.1954211Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.1954582Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1954947Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1955316Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1955526Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.1955733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.1955939Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.1956136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.1956353Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1956565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1956774Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1956983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1958004Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.1958109Z warnings.warn( 2022-11-23T03:12:19.1959082Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1959220Z warnings.warn( 2022-11-23T03:12:19.1960405Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1960505Z warnings.warn( 2022-11-23T03:12:19.1961501Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1961606Z warnings.warn( 2022-11-23T03:12:19.1961827Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1962042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1962261Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1962478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1962691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1962905Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1963118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1963494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1963709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1963908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1964114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1964319Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1964525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1964731Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1964935Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1965139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1965349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1965590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1965801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1966100Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1966308Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1966512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1966717Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1966921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1967125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1967395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1967592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1967795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1967998Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1968200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1968402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1968605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1968811Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1969017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1969221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1969425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1969629Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1969830Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1970032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1970235Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1970439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1970641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1971013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1971232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1971444Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1971654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1971863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1972073Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1972286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1972497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1972707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1972916Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1973180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1973399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1973609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1973819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1974192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1974393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1974595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1974792Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1975043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1975245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1975625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1975837Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1976049Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1976258Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1976467Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1976676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1976881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1977098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1977309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1977521Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1977733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1977941Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1978151Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1978362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1978567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1978778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1978996Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1979208Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1979421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1979630Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1979840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1980051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1980256Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1980465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1980681Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1981092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.1981308Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1981511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1981714Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1982095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1982197Z dist init r=1, world=4 2022-11-23T03:12:19.1982293Z dist init r=0, world=4 2022-11-23T03:12:19.1982389Z dist init r=3, world=4 2022-11-23T03:12:19.1982487Z dist init r=2, world=4 2022-11-23T03:12:19.1982576Z ok (9.529s) 2022-11-23T03:12:19.1982903Z test_transformer_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28881 2022-11-23T03:12:19.1983192Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28882 2022-11-23T03:12:19.1983394Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28883 2022-11-23T03:12:19.1983589Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28884 2022-11-23T03:12:19.1984159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1984334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1984704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1985044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1985384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1985549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1985902Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1986069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1986412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1986570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1986918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1987089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1987427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.1987587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.1987941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.1988114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.1988331Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.1988554Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.1988776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.1988996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.1989418Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.1989858Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1990234Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1990593Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.1990804Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.1991003Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.1991213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.1991417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.1991632Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1991909Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1992293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1992512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.1993522Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.1993625Z warnings.warn( 2022-11-23T03:12:19.1994623Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1994728Z warnings.warn( 2022-11-23T03:12:19.1996048Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.1996140Z warnings.warn( 2022-11-23T03:12:19.1997137Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.1997240Z warnings.warn( 2022-11-23T03:12:19.1997361Z File "", line 1, in 2022-11-23T03:12:19.1997563Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.1997694Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.1997885Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.1998023Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.1998225Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.1998313Z self.run() 2022-11-23T03:12:19.1998549Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.1998691Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.1999025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.1999147Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.1999647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.1999754Z getattr(self, test_name)() 2022-11-23T03:12:19.2000088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2000166Z fn() 2022-11-23T03:12:19.2000506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2000613Z test(self, **param_kwargs) 2022-11-23T03:12:19.2000996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2001284Z return func(*args, **kwargs) 2022-11-23T03:12:19.2001513Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2001614Z self.run_subtests( 2022-11-23T03:12:19.2001959Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2002104Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2002455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2002596Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2002960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2003073Z output = model(*input) 2022-11-23T03:12:19.2003388Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2003520Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2003886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2004044Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2004398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2004507Z _lazy_init(state, module) 2022-11-23T03:12:19.2004847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2004979Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2005308Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2005426Z return func(*args, **kwargs) 2022-11-23T03:12:19.2005956Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2006039Z p_assert( 2022-11-23T03:12:19.2006354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2006464Z traceback.print_stack() 2022-11-23T03:12:19.2006576Z File "", line 1, in 2022-11-23T03:12:19.2006767Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2007071Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2007260Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2007399Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2007591Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2007687Z self.run() 2022-11-23T03:12:19.2007923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2008064Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2008390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2008511Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2008858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2008964Z getattr(self, test_name)() 2022-11-23T03:12:19.2009308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2009395Z fn() 2022-11-23T03:12:19.2009745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2009855Z test(self, **param_kwargs) 2022-11-23T03:12:19.2010253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2010367Z return func(*args, **kwargs) 2022-11-23T03:12:19.2010593Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2010689Z self.run_subtests( 2022-11-23T03:12:19.2011029Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2011181Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2011529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2011669Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2012027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2012138Z output = model(*input) 2022-11-23T03:12:19.2012452Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2012575Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2013099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2013257Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2013600Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2013879Z _lazy_init(state, module) 2022-11-23T03:12:19.2014221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2014353Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2014676Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2014787Z return func(*args, **kwargs) 2022-11-23T03:12:19.2015156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2015247Z p_assert( 2022-11-23T03:12:19.2015570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2015683Z traceback.print_stack() 2022-11-23T03:12:19.2015799Z File "", line 1, in 2022-11-23T03:12:19.2015996Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2016127Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2016313Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2016453Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2016652Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2016748Z self.run() 2022-11-23T03:12:19.2016986Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2017125Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2017452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2017567Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2017915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2018026Z getattr(self, test_name)() 2022-11-23T03:12:19.2018370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2018457Z fn() 2022-11-23T03:12:19.2018806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2018917Z test(self, **param_kwargs) 2022-11-23T03:12:19.2019476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2019580Z return func(*args, **kwargs) 2022-11-23T03:12:19.2019798Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2019896Z self.run_subtests( 2022-11-23T03:12:19.2020226Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2020369Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2020704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2020839Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2021187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2021289Z output = model(*input) 2022-11-23T03:12:19.2021592Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2021716Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2022066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2022225Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2022564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2022668Z _lazy_init(state, module) 2022-11-23T03:12:19.2022995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2023115Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2023426Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2023539Z return func(*args, **kwargs) 2022-11-23T03:12:19.2024095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2024197Z p_assert( 2022-11-23T03:12:19.2024514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2024625Z traceback.print_stack() 2022-11-23T03:12:19.2024739Z File "", line 1, in 2022-11-23T03:12:19.2024925Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2025050Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2025232Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2025364Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2025726Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2025823Z self.run() 2022-11-23T03:12:19.2026078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2026225Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2026548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2026670Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2027016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2027126Z getattr(self, test_name)() 2022-11-23T03:12:19.2027471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2027558Z fn() 2022-11-23T03:12:19.2027907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2028015Z test(self, **param_kwargs) 2022-11-23T03:12:19.2028438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2028553Z return func(*args, **kwargs) 2022-11-23T03:12:19.2028779Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2028880Z self.run_subtests( 2022-11-23T03:12:19.2029218Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2029367Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2029715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2029855Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2030217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2030328Z output = model(*input) 2022-11-23T03:12:19.2030643Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2030935Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2031284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2031442Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2031782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2031886Z _lazy_init(state, module) 2022-11-23T03:12:19.2032204Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2032330Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2032963Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2033079Z return func(*args, **kwargs) 2022-11-23T03:12:19.2033449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2033541Z p_assert( 2022-11-23T03:12:19.2033862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2033970Z traceback.print_stack() 2022-11-23T03:12:19.2034197Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2034419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2034637Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2034853Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2034970Z File "", line 1, in 2022-11-23T03:12:19.2035221Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2035357Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2035540Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2035678Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2036039Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2036128Z self.run() 2022-11-23T03:12:19.2036487Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2036621Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2036952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2037073Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2037416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2037574Z getattr(self, test_name)() 2022-11-23T03:12:19.2037923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2038009Z fn() 2022-11-23T03:12:19.2038360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2038471Z test(self, **param_kwargs) 2022-11-23T03:12:19.2038814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2038926Z return func(*args, **kwargs) 2022-11-23T03:12:19.2039305Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2039403Z self.run_subtests( 2022-11-23T03:12:19.2039731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2039879Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2040400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2040543Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2040905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2041013Z output = model(*input) 2022-11-23T03:12:19.2041321Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2041450Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2041816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2041979Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2042330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2042446Z _lazy_init(state, module) 2022-11-23T03:12:19.2042784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2042914Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2043241Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2043407Z return func(*args, **kwargs) 2022-11-23T03:12:19.2043780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2043872Z p_assert( 2022-11-23T03:12:19.2044194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2044459Z traceback.print_stack() 2022-11-23T03:12:19.2044573Z File "", line 1, in 2022-11-23T03:12:19.2044814Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2044941Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2045124Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2045256Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2045449Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2045537Z self.run() 2022-11-23T03:12:19.2045719Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2045849Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2046158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2046275Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2046608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2046763Z getattr(self, test_name)() 2022-11-23T03:12:19.2047100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2047183Z fn() 2022-11-23T03:12:19.2047697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2047808Z test(self, **param_kwargs) 2022-11-23T03:12:19.2048143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2048256Z return func(*args, **kwargs) 2022-11-23T03:12:19.2048481Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2048582Z self.run_subtests( 2022-11-23T03:12:19.2048922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2049074Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2049427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2049569Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2049925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2050034Z output = model(*input) 2022-11-23T03:12:19.2050343Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2050470Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2050830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2050994Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2051352Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2051461Z _lazy_init(state, module) 2022-11-23T03:12:19.2051792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2051923Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2052246Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2052358Z return func(*args, **kwargs) 2022-11-23T03:12:19.2052721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2052812Z p_assert( 2022-11-23T03:12:19.2053131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2053244Z traceback.print_stack() 2022-11-23T03:12:19.2053355Z File "", line 1, in 2022-11-23T03:12:19.2053606Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2053744Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2053934Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2054072Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2054271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2054362Z self.run() 2022-11-23T03:12:19.2054545Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2054679Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2055004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2055124Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2055469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2055789Z getattr(self, test_name)() 2022-11-23T03:12:19.2056128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2056210Z fn() 2022-11-23T03:12:19.2056540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2056648Z test(self, **param_kwargs) 2022-11-23T03:12:19.2056977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2057084Z return func(*args, **kwargs) 2022-11-23T03:12:19.2057301Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2057398Z self.run_subtests( 2022-11-23T03:12:19.2057722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2057870Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2058201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2058339Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2058688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2058791Z output = model(*input) 2022-11-23T03:12:19.2059264Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2059394Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2059757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2059920Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2060276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2060387Z _lazy_init(state, module) 2022-11-23T03:12:19.2060727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2060857Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2061180Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2061292Z return func(*args, **kwargs) 2022-11-23T03:12:19.2061656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2061746Z p_assert( 2022-11-23T03:12:19.2062060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2062174Z traceback.print_stack() 2022-11-23T03:12:19.2062293Z File "", line 1, in 2022-11-23T03:12:19.2062540Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2062677Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2062867Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2063166Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2063360Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2063442Z self.run() 2022-11-23T03:12:19.2063625Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2063754Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2064274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2064393Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2064729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2064915Z getattr(self, test_name)() 2022-11-23T03:12:19.2065250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2065327Z fn() 2022-11-23T03:12:19.2065664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2065772Z test(self, **param_kwargs) 2022-11-23T03:12:19.2066101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2066210Z return func(*args, **kwargs) 2022-11-23T03:12:19.2066431Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2066529Z self.run_subtests( 2022-11-23T03:12:19.2066847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2066998Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2067342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2067478Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2067829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2067932Z output = model(*input) 2022-11-23T03:12:19.2068233Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2068356Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2068698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2068859Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2069205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2069314Z _lazy_init(state, module) 2022-11-23T03:12:19.2069639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2069767Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2070080Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2070188Z return func(*args, **kwargs) 2022-11-23T03:12:19.2070720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2070806Z p_assert( 2022-11-23T03:12:19.2071134Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2071248Z traceback.print_stack() 2022-11-23T03:12:19.2071477Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2071763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2071993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2072213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2072323Z File "", line 1, in 2022-11-23T03:12:19.2072524Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2072652Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2072840Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2072978Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2073176Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2073320Z self.run() 2022-11-23T03:12:19.2073514Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2073641Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2073970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2074090Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2074436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2074548Z getattr(self, test_name)() 2022-11-23T03:12:19.2074893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2074977Z fn() 2022-11-23T03:12:19.2075479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2075580Z test(self, **param_kwargs) 2022-11-23T03:12:19.2076095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2076210Z return func(*args, **kwargs) 2022-11-23T03:12:19.2076437Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2076538Z self.run_subtests( 2022-11-23T03:12:19.2076877Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2077026Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2077378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2077513Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2077876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2077984Z output = model(*input) 2022-11-23T03:12:19.2078303Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2078434Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2078797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2078962Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2079314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2079417Z _lazy_init(state, module) 2022-11-23T03:12:19.2079754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2079884Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2080209Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2080325Z return func(*args, **kwargs) 2022-11-23T03:12:19.2080736Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2080833Z p_assert( 2022-11-23T03:12:19.2081160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2081266Z traceback.print_stack() 2022-11-23T03:12:19.2081384Z File "", line 1, in 2022-11-23T03:12:19.2081580Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2081710Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2081900Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2082038Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2082235Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2082371Z self.run() 2022-11-23T03:12:19.2082565Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2082699Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2083353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2083476Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2083827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2083937Z getattr(self, test_name)() 2022-11-23T03:12:19.2084283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2084363Z fn() 2022-11-23T03:12:19.2084717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2084829Z test(self, **param_kwargs) 2022-11-23T03:12:19.2085180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2085296Z return func(*args, **kwargs) 2022-11-23T03:12:19.2085524Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2085626Z self.run_subtests( 2022-11-23T03:12:19.2085965Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2086110Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2086460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2086599Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2086961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2087072Z output = model(*input) 2022-11-23T03:12:19.2087391Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2087680Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2088030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2088181Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2088521Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2088626Z _lazy_init(state, module) 2022-11-23T03:12:19.2088951Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2089075Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2089451Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2089565Z return func(*args, **kwargs) 2022-11-23T03:12:19.2089968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2090055Z p_assert( 2022-11-23T03:12:19.2090547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2090662Z traceback.print_stack() 2022-11-23T03:12:19.2090778Z File "", line 1, in 2022-11-23T03:12:19.2090977Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2091106Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2091295Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2091433Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2091627Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2091780Z self.run() 2022-11-23T03:12:19.2091972Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2092106Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2092436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2092555Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2092901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2093004Z getattr(self, test_name)() 2022-11-23T03:12:19.2093349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2093435Z fn() 2022-11-23T03:12:19.2093788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2093898Z test(self, **param_kwargs) 2022-11-23T03:12:19.2094249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2094362Z return func(*args, **kwargs) 2022-11-23T03:12:19.2094588Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2094683Z self.run_subtests( 2022-11-23T03:12:19.2095022Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2095171Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2096130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2096309Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2096853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2096969Z output = model(*input) 2022-11-23T03:12:19.2097863Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2097994Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2098362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2098526Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2098879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2098990Z _lazy_init(state, module) 2022-11-23T03:12:19.2099335Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2099469Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2099793Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2099906Z return func(*args, **kwargs) 2022-11-23T03:12:19.2100351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2100450Z p_assert( 2022-11-23T03:12:19.2100779Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2100895Z traceback.print_stack() 2022-11-23T03:12:19.2101017Z File "", line 1, in 2022-11-23T03:12:19.2101214Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2101345Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2101531Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2101673Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2101876Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2102036Z self.run() 2022-11-23T03:12:19.2102231Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2102367Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2102696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2102812Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2103159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2103271Z getattr(self, test_name)() 2022-11-23T03:12:19.2103616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2103705Z fn() 2022-11-23T03:12:19.2104295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2104411Z test(self, **param_kwargs) 2022-11-23T03:12:19.2104769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2104876Z return func(*args, **kwargs) 2022-11-23T03:12:19.2105103Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2105206Z self.run_subtests( 2022-11-23T03:12:19.2105546Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2105695Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2106044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2106186Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2106544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2106650Z output = model(*input) 2022-11-23T03:12:19.2106965Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2107094Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2107457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2107621Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2107973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2108082Z _lazy_init(state, module) 2022-11-23T03:12:19.2108418Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2108549Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2108865Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2108981Z return func(*args, **kwargs) 2022-11-23T03:12:19.2109422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2109524Z p_assert( 2022-11-23T03:12:19.2109849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2109963Z traceback.print_stack() 2022-11-23T03:12:19.2110188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2110405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2110628Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2110848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2110965Z File "", line 1, in 2022-11-23T03:12:19.2111227Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2111361Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2111551Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2111689Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2111882Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2111973Z self.run() 2022-11-23T03:12:19.2112163Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2112296Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2112627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2112748Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2113100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2113215Z getattr(self, test_name)() 2022-11-23T03:12:19.2113561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2113648Z fn() 2022-11-23T03:12:19.2114000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2114111Z test(self, **param_kwargs) 2022-11-23T03:12:19.2114453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2114565Z return func(*args, **kwargs) 2022-11-23T03:12:19.2114791Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2114894Z self.run_subtests( 2022-11-23T03:12:19.2115225Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2115379Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2115730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2115873Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2116231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2116339Z output = model(*input) 2022-11-23T03:12:19.2116650Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2116778Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2117134Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2117298Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2117651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2117819Z _lazy_init(state, module) 2022-11-23T03:12:19.2118168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2118299Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2118622Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2118734Z return func(*args, **kwargs) 2022-11-23T03:12:19.2119092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2119183Z p_assert( 2022-11-23T03:12:19.2119504Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2119617Z traceback.print_stack() 2022-11-23T03:12:19.2119733Z File "", line 1, in 2022-11-23T03:12:19.2119981Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2120115Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2120297Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2120439Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2120638Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2120729Z self.run() 2022-11-23T03:12:19.2120920Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2121053Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2121378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2121499Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2121840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2121955Z getattr(self, test_name)() 2022-11-23T03:12:19.2122305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2122392Z fn() 2022-11-23T03:12:19.2122743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2122854Z test(self, **param_kwargs) 2022-11-23T03:12:19.2123194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2123307Z return func(*args, **kwargs) 2022-11-23T03:12:19.2123529Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2123633Z self.run_subtests( 2022-11-23T03:12:19.2123971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2124125Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2124478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2124620Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2124982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2125089Z output = model(*input) 2022-11-23T03:12:19.2125395Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2125524Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2125887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2126050Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2126397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2126592Z _lazy_init(state, module) 2022-11-23T03:12:19.2126940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2127072Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2127389Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2127503Z return func(*args, **kwargs) 2022-11-23T03:12:19.2127870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2127960Z p_assert( 2022-11-23T03:12:19.2128283Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2128397Z traceback.print_stack() 2022-11-23T03:12:19.2128515Z File "", line 1, in 2022-11-23T03:12:19.2128764Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2128890Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2129082Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2129221Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2129422Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2129514Z self.run() 2022-11-23T03:12:19.2129705Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2129840Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2130163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2130284Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2130630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2130745Z getattr(self, test_name)() 2022-11-23T03:12:19.2131257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2131343Z fn() 2022-11-23T03:12:19.2131683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2131791Z test(self, **param_kwargs) 2022-11-23T03:12:19.2132115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2132221Z return func(*args, **kwargs) 2022-11-23T03:12:19.2132439Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2132545Z self.run_subtests( 2022-11-23T03:12:19.2132920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2133246Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2133600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2133740Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2134093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2134204Z output = model(*input) 2022-11-23T03:12:19.2134518Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2134652Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2135016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2135182Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2135536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2135697Z _lazy_init(state, module) 2022-11-23T03:12:19.2136038Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2136332Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2136818Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2136931Z return func(*args, **kwargs) 2022-11-23T03:12:19.2137299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2137392Z p_assert( 2022-11-23T03:12:19.2137714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2137826Z traceback.print_stack() 2022-11-23T03:12:19.2137938Z File "", line 1, in 2022-11-23T03:12:19.2138186Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2138315Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2138504Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2138642Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2138840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2138931Z self.run() 2022-11-23T03:12:19.2139116Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2139256Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2139745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2139864Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2140201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2140311Z getattr(self, test_name)() 2022-11-23T03:12:19.2140830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2140921Z fn() 2022-11-23T03:12:19.2141263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2141379Z test(self, **param_kwargs) 2022-11-23T03:12:19.2141717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2141830Z return func(*args, **kwargs) 2022-11-23T03:12:19.2142055Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2142156Z self.run_subtests( 2022-11-23T03:12:19.2142494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2142647Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2142995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2143138Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2143502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2143771Z output = model(*input) 2022-11-23T03:12:19.2144288Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2144413Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2144763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2144920Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2145513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2145634Z _lazy_init(state, module) 2022-11-23T03:12:19.2145976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2146106Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2146431Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2146543Z return func(*args, **kwargs) 2022-11-23T03:12:19.2146906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2146997Z p_assert( 2022-11-23T03:12:19.2147315Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2147432Z traceback.print_stack() 2022-11-23T03:12:19.2147657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2147946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2148166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2148385Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2148501Z File "", line 1, in 2022-11-23T03:12:19.2148700Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2148824Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2149016Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2149153Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2149354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2149446Z self.run() 2022-11-23T03:12:19.2149639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2149776Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2150110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2150225Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2150571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2150684Z getattr(self, test_name)() 2022-11-23T03:12:19.2151030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2151115Z fn() 2022-11-23T03:12:19.2151466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2151578Z test(self, **param_kwargs) 2022-11-23T03:12:19.2151921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2152033Z return func(*args, **kwargs) 2022-11-23T03:12:19.2152262Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2152363Z self.run_subtests( 2022-11-23T03:12:19.2152702Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2152852Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2153200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2153343Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2153705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2153810Z output = model(*input) 2022-11-23T03:12:19.2154169Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2154304Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2154831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2154991Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2155330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2155435Z _lazy_init(state, module) 2022-11-23T03:12:19.2155762Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2155881Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2156193Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2156366Z return func(*args, **kwargs) 2022-11-23T03:12:19.2156727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2156816Z p_assert( 2022-11-23T03:12:19.2157128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2157240Z traceback.print_stack() 2022-11-23T03:12:19.2157354Z File "", line 1, in 2022-11-23T03:12:19.2157540Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2157668Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2157851Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2157985Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2158176Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2158264Z self.run() 2022-11-23T03:12:19.2158634Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2158767Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2159094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2159260Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2159767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2159879Z getattr(self, test_name)() 2022-11-23T03:12:19.2160225Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2160313Z fn() 2022-11-23T03:12:19.2160884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2161085Z test(self, **param_kwargs) 2022-11-23T03:12:19.2161692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2161817Z return func(*args, **kwargs) 2022-11-23T03:12:19.2162042Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2162145Z self.run_subtests( 2022-11-23T03:12:19.2162487Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2162639Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2162991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2163127Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2163485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2163593Z output = model(*input) 2022-11-23T03:12:19.2163970Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2164107Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2164473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2164638Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2164990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2165093Z _lazy_init(state, module) 2022-11-23T03:12:19.2165431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2165562Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2165883Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2166045Z return func(*args, **kwargs) 2022-11-23T03:12:19.2166415Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2166506Z p_assert( 2022-11-23T03:12:19.2166832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2166940Z traceback.print_stack() 2022-11-23T03:12:19.2167059Z File "", line 1, in 2022-11-23T03:12:19.2167257Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2167388Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2167579Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2167716Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2167916Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2168006Z self.run() 2022-11-23T03:12:19.2168202Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2168336Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2168660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2168780Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2169127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2169239Z getattr(self, test_name)() 2022-11-23T03:12:19.2169584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2169664Z fn() 2022-11-23T03:12:19.2170014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2170126Z test(self, **param_kwargs) 2022-11-23T03:12:19.2170247Z File "", line 1, in 2022-11-23T03:12:19.2170592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2170706Z return func(*args, **kwargs) 2022-11-23T03:12:19.2170931Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2171033Z self.run_subtests( 
2022-11-23T03:12:19.2171226Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2171357Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2171696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2171847Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2172037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2172175Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2172578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2172729Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2172923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2173016Z self.run() 2022-11-23T03:12:19.2173379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2173488Z output = model(*input) 2022-11-23T03:12:19.2173681Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2173813Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2174121Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2174245Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2174621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2174745Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2175111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2175274Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2175617Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2175728Z getattr(self, test_name)() 2022-11-23T03:12:19.2176077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2176185Z _lazy_init(state, module) 2022-11-23T03:12:19.2176523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2176609Z fn() 2022-11-23T03:12:19.2176952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2177083Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2177434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2177544Z test(self, **param_kwargs) 2022-11-23T03:12:19.2177865Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2177971Z return func(*args, **kwargs) 2022-11-23T03:12:19.2178316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2178428Z return func(*args, **kwargs) 2022-11-23T03:12:19.2178793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2178885Z p_assert( 2022-11-23T03:12:19.2179120Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2179223Z self.run_subtests( 2022-11-23T03:12:19.2179546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2179654Z traceback.print_stack() 2022-11-23T03:12:19.2179995Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2180149Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2180498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2180639Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2180999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2181109Z output = model(*input) 2022-11-23T03:12:19.2181464Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2181592Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2181961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2182123Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2182475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2182584Z _lazy_init(state, module) 2022-11-23T03:12:19.2182922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2183053Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2183376Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2183529Z return func(*args, **kwargs) 2022-11-23T03:12:19.2184117Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2184223Z p_assert( 2022-11-23T03:12:19.2184548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2184661Z traceback.print_stack() 2022-11-23T03:12:19.2184886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2185110Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2185330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2185543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2185663Z File "", line 1, in 2022-11-23T03:12:19.2185866Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2185998Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2186190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2186329Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2186529Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2186621Z self.run() 2022-11-23T03:12:19.2186807Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2186941Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2187271Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2187394Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2187742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2187857Z getattr(self, test_name)() 2022-11-23T03:12:19.2188206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2188292Z fn() 2022-11-23T03:12:19.2188638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2188749Z test(self, **param_kwargs) 2022-11-23T03:12:19.2189091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2189251Z return func(*args, **kwargs) 2022-11-23T03:12:19.2189480Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2189581Z self.run_subtests( 2022-11-23T03:12:19.2189922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2190077Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2190501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2190655Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2191058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2191166Z output = model(*input) 2022-11-23T03:12:19.2191477Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2191606Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2191969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2192133Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2192481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2192820Z _lazy_init(state, module) 2022-11-23T03:12:19.2193341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2193473Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2193801Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2193914Z return func(*args, **kwargs) 2022-11-23T03:12:19.2194278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2194371Z p_assert( 2022-11-23T03:12:19.2194688Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2194804Z traceback.print_stack() 2022-11-23T03:12:19.2194921Z File "", line 1, in 2022-11-23T03:12:19.2195122Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2195256Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2195446Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2195585Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2195784Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2195878Z self.run() 2022-11-23T03:12:19.2196226Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2196360Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2196679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2196795Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2197309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2197425Z getattr(self, test_name)() 2022-11-23T03:12:19.2197767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2197855Z fn() 2022-11-23T03:12:19.2198208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2198319Z test(self, **param_kwargs) 2022-11-23T03:12:19.2198661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2198775Z return func(*args, **kwargs) 2022-11-23T03:12:19.2199001Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2199102Z self.run_subtests( 2022-11-23T03:12:19.2199434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2199593Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2199991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2200293Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2200647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2200752Z output = model(*input) 2022-11-23T03:12:19.2201051Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2201176Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2201518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2201678Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2202206Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2202367Z _lazy_init(state, module) 2022-11-23T03:12:19.2202708Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2202842Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2203167Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2203284Z return func(*args, **kwargs) 2022-11-23T03:12:19.2203642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2203733Z p_assert( 2022-11-23T03:12:19.2204057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2204173Z traceback.print_stack() 2022-11-23T03:12:19.2204290Z File "", line 1, in 2022-11-23T03:12:19.2204492Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2204626Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2204815Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2204953Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2205311Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2205401Z self.run() 2022-11-23T03:12:19.2205585Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2205714Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2206030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2206326Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2206670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2206788Z getattr(self, test_name)() 2022-11-23T03:12:19.2207137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2207224Z fn() 2022-11-23T03:12:19.2207576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2207687Z test(self, **param_kwargs) 2022-11-23T03:12:19.2208032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2208145Z return func(*args, **kwargs) 2022-11-23T03:12:19.2208364Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2208466Z self.run_subtests( 2022-11-23T03:12:19.2208805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2208960Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2209359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2209506Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2209869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2209977Z output = model(*input) 2022-11-23T03:12:19.2210281Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2210410Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2210772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2210937Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2211291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2211455Z _lazy_init(state, module) 2022-11-23T03:12:19.2211796Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2211925Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2212242Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2212355Z return func(*args, **kwargs) 2022-11-23T03:12:19.2212723Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2212812Z p_assert( 2022-11-23T03:12:19.2213133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2213401Z traceback.print_stack() 2022-11-23T03:12:19.2213517Z File "", line 1, in 2022-11-23T03:12:19.2213709Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2213831Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2214014Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2214148Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2214342Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2214430Z self.run() 2022-11-23T03:12:19.2214613Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2214742Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2215235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2215358Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2215705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2215819Z getattr(self, test_name)() 2022-11-23T03:12:19.2216294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2216383Z fn() 2022-11-23T03:12:19.2216735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2216847Z test(self, **param_kwargs) 2022-11-23T03:12:19.2217186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2217304Z return func(*args, **kwargs) 2022-11-23T03:12:19.2217529Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2217631Z self.run_subtests( 2022-11-23T03:12:19.2218127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2218274Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2218656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2218799Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2219142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2219248Z output = model(*input) 2022-11-23T03:12:19.2219551Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2219675Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2220028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2220186Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2220526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2220697Z _lazy_init(state, module) 2022-11-23T03:12:19.2221021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2221153Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2221465Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2221572Z return func(*args, **kwargs) 2022-11-23T03:12:19.2221925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2222012Z p_assert( 2022-11-23T03:12:19.2222324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2222435Z traceback.print_stack() 2022-11-23T03:12:19.2222650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2222874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2223087Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2223300Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2223589Z File "", line 1, in 2022-11-23T03:12:19.2223788Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2224207Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2224409Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2224543Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2224745Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2224838Z self.run() 2022-11-23T03:12:19.2225034Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2225171Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2225504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2225625Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2225973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2226078Z getattr(self, test_name)() 2022-11-23T03:12:19.2226424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2226510Z fn() 2022-11-23T03:12:19.2226857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2226967Z test(self, **param_kwargs) 2022-11-23T03:12:19.2227305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2227528Z return func(*args, **kwargs) 2022-11-23T03:12:19.2227760Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2227864Z self.run_subtests( 2022-11-23T03:12:19.2228203Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2228352Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2228704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2228845Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2229206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2229313Z output = model(*input) 2022-11-23T03:12:19.2229696Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2229823Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2230188Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2230353Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2230706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2230814Z _lazy_init(state, module) 2022-11-23T03:12:19.2231149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2231280Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2231604Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2231709Z return func(*args, **kwargs) 2022-11-23T03:12:19.2232082Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2232175Z p_assert( 2022-11-23T03:12:19.2232499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2232640Z traceback.print_stack() 2022-11-23T03:12:19.2232783Z File "", line 1, in 2022-11-23T03:12:19.2232981Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2233107Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2233633Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2233773Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2233974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2234066Z self.run() 2022-11-23T03:12:19.2234262Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2234402Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2234733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2234850Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2235199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2235315Z getattr(self, test_name)() 2022-11-23T03:12:19.2235660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2235748Z fn() 2022-11-23T03:12:19.2236100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2236209Z test(self, **param_kwargs) 2022-11-23T03:12:19.2236556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2236871Z return func(*args, **kwargs) 2022-11-23T03:12:19.2237277Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2237427Z self.run_subtests( 2022-11-23T03:12:19.2237813Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2237963Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2238311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2238451Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2238811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2238912Z output = model(*input) 2022-11-23T03:12:19.2239281Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2239409Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2239769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2239933Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2240284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2240393Z _lazy_init(state, module) 2022-11-23T03:12:19.2240729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2240855Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2241177Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2241293Z return func(*args, **kwargs) 2022-11-23T03:12:19.2241659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2241750Z p_assert( 2022-11-23T03:12:19.2242072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2242186Z traceback.print_stack() 2022-11-23T03:12:19.2242302Z File "", line 1, in 2022-11-23T03:12:19.2242494Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2242624Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2242815Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2242954Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2243152Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2243244Z self.run() 2022-11-23T03:12:19.2243441Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2243574Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2243899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2244021Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2244369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2244480Z getattr(self, test_name)() 2022-11-23T03:12:19.2244824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2244913Z fn() 2022-11-23T03:12:19.2245266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2245372Z test(self, **param_kwargs) 2022-11-23T03:12:19.2245715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2245879Z return func(*args, **kwargs) 2022-11-23T03:12:19.2246112Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2246214Z self.run_subtests( 2022-11-23T03:12:19.2246556Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2246867Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2247208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2247339Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2247688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2247796Z output = model(*input) 2022-11-23T03:12:19.2248324Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2248457Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2248819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2248981Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2249333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2249436Z _lazy_init(state, module) 2022-11-23T03:12:19.2249775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2249905Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2250231Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2250347Z return func(*args, **kwargs) 2022-11-23T03:12:19.2250713Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2250805Z p_assert( 2022-11-23T03:12:19.2251128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2251238Z traceback.print_stack() 2022-11-23T03:12:19.2251357Z File "", line 1, in 2022-11-23T03:12:19.2251553Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2251684Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2251874Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2252013Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2252214Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2252302Z self.run() 2022-11-23T03:12:19.2252504Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2252639Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2252963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2253085Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2253432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2253546Z getattr(self, test_name)() 2022-11-23T03:12:19.2253892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2253972Z fn() 2022-11-23T03:12:19.2254324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2254435Z test(self, **param_kwargs) 2022-11-23T03:12:19.2254782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2255222Z return func(*args, **kwargs) 2022-11-23T03:12:19.2255521Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2255620Z self.run_subtests( 2022-11-23T03:12:19.2255949Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2256092Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2256430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2256568Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2256917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2257023Z output = model(*input) 2022-11-23T03:12:19.2257377Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2257502Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2257853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2258005Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2258346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2258453Z _lazy_init(state, module) 2022-11-23T03:12:19.2258779Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2258908Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2259220Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2259331Z return func(*args, **kwargs) 2022-11-23T03:12:19.2259685Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2259767Z p_assert( 2022-11-23T03:12:19.2260077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2260189Z traceback.print_stack() 2022-11-23T03:12:19.2260586Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2260807Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2261027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2261245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2261363Z File "", line 1, in 2022-11-23T03:12:19.2261554Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2261693Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2261882Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2262022Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2262220Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2262311Z self.run() 2022-11-23T03:12:19.2262499Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2262635Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2262958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2263081Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2263427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2263542Z getattr(self, test_name)() 2022-11-23T03:12:19.2264495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2264595Z fn() 2022-11-23T03:12:19.2264954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2265059Z test(self, **param_kwargs) 2022-11-23T03:12:19.2265401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2265513Z return func(*args, **kwargs) 2022-11-23T03:12:19.2265740Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2265841Z self.run_subtests( 2022-11-23T03:12:19.2266294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2266446Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2266913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2267047Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2267406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2267513Z output = model(*input) 2022-11-23T03:12:19.2267986Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2268111Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2268463Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2268620Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2268958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2269070Z _lazy_init(state, module) 2022-11-23T03:12:19.2269392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2269521Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2269838Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2269947Z return func(*args, **kwargs) 2022-11-23T03:12:19.2270299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2270385Z p_assert( 2022-11-23T03:12:19.2270698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2270802Z traceback.print_stack() 2022-11-23T03:12:19.2270914Z File "", line 1, in 2022-11-23T03:12:19.2271102Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2271235Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2271419Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2271728Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2271927Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2272020Z self.run() 2022-11-23T03:12:19.2272202Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2272334Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2272663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2272784Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2273131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2273246Z getattr(self, test_name)() 2022-11-23T03:12:19.2273640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2273733Z fn() 2022-11-23T03:12:19.2274081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2274191Z test(self, **param_kwargs) 2022-11-23T03:12:19.2274544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2274823Z return func(*args, **kwargs) 2022-11-23T03:12:19.2275051Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2275158Z self.run_subtests( 2022-11-23T03:12:19.2275495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2275695Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2276209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2283121Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2283575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2283689Z output = model(*input) 2022-11-23T03:12:19.2284013Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2284145Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2284512Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2284681Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2285039Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2285162Z _lazy_init(state, module) 2022-11-23T03:12:19.2285656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2285786Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2286103Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2286213Z return func(*args, **kwargs) 2022-11-23T03:12:19.2286564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2286652Z p_assert( 2022-11-23T03:12:19.2286966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2287078Z traceback.print_stack() 2022-11-23T03:12:19.2287191Z File "", line 1, in 2022-11-23T03:12:19.2287389Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2287518Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2287696Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2287833Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2288026Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2288115Z self.run() 2022-11-23T03:12:19.2288299Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2288427Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2288744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2288861Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2289255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2289375Z getattr(self, test_name)() 2022-11-23T03:12:19.2289814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2289907Z fn() 2022-11-23T03:12:19.2290254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2290367Z test(self, **param_kwargs) 2022-11-23T03:12:19.2290700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2290811Z return func(*args, **kwargs) 2022-11-23T03:12:19.2291023Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2291121Z self.run_subtests( 2022-11-23T03:12:19.2291449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2291668Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2292011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2292149Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2292499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2292606Z output = model(*input) 2022-11-23T03:12:19.2292902Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2293027Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2293564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2293731Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2294088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2294206Z _lazy_init(state, module) 2022-11-23T03:12:19.2294547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2294680Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2294999Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2295112Z return func(*args, **kwargs) 2022-11-23T03:12:19.2295478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2295570Z p_assert( 2022-11-23T03:12:19.2295894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2296010Z traceback.print_stack() 2022-11-23T03:12:19.2296128Z File "", line 1, in 2022-11-23T03:12:19.2296334Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2296462Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2296654Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2296794Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2296992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2297083Z self.run() 2022-11-23T03:12:19.2297272Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2297406Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2297727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2297848Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2298196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2298311Z getattr(self, test_name)() 2022-11-23T03:12:19.2298708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2298805Z fn() 2022-11-23T03:12:19.2299162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2299275Z test(self, **param_kwargs) 2022-11-23T03:12:19.2299609Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2299722Z return func(*args, **kwargs) 2022-11-23T03:12:19.2299946Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2300047Z self.run_subtests( 2022-11-23T03:12:19.2300544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2300740Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2301085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2301222Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2301566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2301672Z output = model(*input) 2022-11-23T03:12:19.2301975Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2302100Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2302641Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2302805Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2303158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2303276Z _lazy_init(state, module) 2022-11-23T03:12:19.2303608Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2303741Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2304336Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2304452Z return func(*args, **kwargs) 2022-11-23T03:12:19.2304819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2304911Z p_assert( 2022-11-23T03:12:19.2305232Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2305347Z traceback.print_stack() 2022-11-23T03:12:19.2305567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2305799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2306181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2306395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2306509Z File "", line 1, in 2022-11-23T03:12:19.2306699Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2306825Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2307009Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2307137Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2307518Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2307611Z self.run() 2022-11-23T03:12:19.2307800Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2308023Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2308369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2308490Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2308842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2308948Z getattr(self, test_name)() 2022-11-23T03:12:19.2309294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2309381Z fn() 2022-11-23T03:12:19.2309731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2309842Z test(self, **param_kwargs) 2022-11-23T03:12:19.2310184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2310367Z return func(*args, **kwargs) 2022-11-23T03:12:19.2310590Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2310695Z self.run_subtests( 2022-11-23T03:12:19.2311034Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2311184Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2311533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2311675Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2312036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2312144Z output = model(*input) 2022-11-23T03:12:19.2312451Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2312588Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2312954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2313117Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2313631Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2313737Z _lazy_init(state, module) 2022-11-23T03:12:19.2314239Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2314372Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2314696Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2314803Z return func(*args, **kwargs) 2022-11-23T03:12:19.2315174Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2315266Z p_assert( 2022-11-23T03:12:19.2315591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2315705Z traceback.print_stack() 2022-11-23T03:12:19.2315822Z File "", line 1, in 2022-11-23T03:12:19.2316017Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2316142Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2316329Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2316468Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2316669Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2316760Z self.run() 2022-11-23T03:12:19.2316953Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2317133Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2317468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2317584Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2317932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2318045Z getattr(self, test_name)() 2022-11-23T03:12:19.2318391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2318478Z fn() 2022-11-23T03:12:19.2318829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2318941Z test(self, **param_kwargs) 2022-11-23T03:12:19.2319281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2319442Z return func(*args, **kwargs) 2022-11-23T03:12:19.2319671Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2319774Z self.run_subtests( 2022-11-23T03:12:19.2320116Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2320267Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2320616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2320756Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2321115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2321216Z output = model(*input) 2022-11-23T03:12:19.2321533Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2321665Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2322029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2322191Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2322700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2322807Z _lazy_init(state, module) 2022-11-23T03:12:19.2323133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2323254Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2323565Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2323675Z return func(*args, **kwargs) 2022-11-23T03:12:19.2324039Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2324126Z p_assert( 2022-11-23T03:12:19.2324439Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2324548Z traceback.print_stack() 2022-11-23T03:12:19.2324663Z File "", line 1, in 2022-11-23T03:12:19.2324769Z File "", line 1, in 2022-11-23T03:12:19.2324962Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2325089Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2325271Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2325403Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2325593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2325724Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2326134Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2326233Z self.run() 2022-11-23T03:12:19.2326423Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2326560Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2326751Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2326884Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2327085Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2327170Z self.run() 2022-11-23T03:12:19.2327497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2327618Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2327808Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2327997Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2328347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2328459Z getattr(self, test_name)() 2022-11-23T03:12:19.2328783Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2328895Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2329241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2329329Z fn() 2022-11-23T03:12:19.2329675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2329786Z getattr(self, test_name)() 2022-11-23T03:12:19.2330136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2330251Z test(self, **param_kwargs) 2022-11-23T03:12:19.2330593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2330671Z fn() 2022-11-23T03:12:19.2331015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2331130Z return func(*args, **kwargs) 2022-11-23T03:12:19.2331637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2331743Z test(self, **param_kwargs) 2022-11-23T03:12:19.2331963Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2332062Z self.run_subtests( 2022-11-23T03:12:19.2332397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2332502Z return func(*args, **kwargs) 2022-11-23T03:12:19.2332886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2333037Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2333254Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2333352Z self.run_subtests( 2022-11-23T03:12:19.2333692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2334002Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2334341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2334482Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2334838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2334996Z output = model(*input) 2022-11-23T03:12:19.2335353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2335492Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2335804Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2335934Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2336294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2336394Z output = model(*input) 2022-11-23T03:12:19.2336756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2337083Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2337617Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2337746Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2338099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T03:12:19.2338209Z _lazy_init(state, module) 2022-11-23T03:12:19.2338570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2338726Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2339065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2339196Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2339544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2339659Z _lazy_init(state, module) 2022-11-23T03:12:19.2339988Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2340104Z return func(*args, **kwargs) 2022-11-23T03:12:19.2340590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2340710Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2341240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2341333Z p_assert( 2022-11-23T03:12:19.2341657Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2341770Z return func(*args, **kwargs) 2022-11-23T03:12:19.2342093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2342212Z traceback.print_stack() 2022-11-23T03:12:19.2342577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2342660Z p_assert( 2022-11-23T03:12:19.2342979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T03:12:19.2343089Z traceback.print_stack() 2022-11-23T03:12:19.2343349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2343597Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2343818Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2344364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2344492Z File "", line 1, in 2022-11-23T03:12:19.2344683Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2344890Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2345091Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2345228Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2345426Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2345521Z self.run() 2022-11-23T03:12:19.2345711Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2345849Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2346179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2346300Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2346651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2346827Z getattr(self, test_name)() 2022-11-23T03:12:19.2347177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2347263Z fn() 2022-11-23T03:12:19.2347768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2347870Z test(self, **param_kwargs) 2022-11-23T03:12:19.2348201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2348309Z return func(*args, **kwargs) 2022-11-23T03:12:19.2348713Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2348816Z self.run_subtests( 2022-11-23T03:12:19.2349154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2349303Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2349662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2349797Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2350157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2350265Z output = model(*input) 2022-11-23T03:12:19.2350581Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2350709Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2351077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2351242Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2351597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2351711Z _lazy_init(state, module) 2022-11-23T03:12:19.2352047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2352179Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2352506Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2352619Z return func(*args, **kwargs) 2022-11-23T03:12:19.2352984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2353075Z p_assert( 2022-11-23T03:12:19.2353397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2353503Z traceback.print_stack() 2022-11-23T03:12:19.2353620Z File "", line 1, in 2022-11-23T03:12:19.2353816Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2353998Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2354196Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2354336Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2354534Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2354626Z self.run() 2022-11-23T03:12:19.2354807Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2354942Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2355269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2355389Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2355735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2355908Z getattr(self, test_name)() 2022-11-23T03:12:19.2356410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2356494Z fn() 2022-11-23T03:12:19.2356831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2356939Z test(self, **param_kwargs) 2022-11-23T03:12:19.2357271Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2357382Z return func(*args, **kwargs) 2022-11-23T03:12:19.2357601Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2357699Z self.run_subtests( 2022-11-23T03:12:19.2358029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2358175Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2358514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2358653Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2359004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2359108Z output = model(*input) 2022-11-23T03:12:19.2359411Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2359536Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2359886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2360045Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2360380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2360493Z _lazy_init(state, module) 2022-11-23T03:12:19.2360998Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2361132Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2361460Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2361572Z return func(*args, **kwargs) 2022-11-23T03:12:19.2361936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2362027Z p_assert( 2022-11-23T03:12:19.2362346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2362460Z traceback.print_stack() 2022-11-23T03:12:19.2362577Z File "", line 1, in 2022-11-23T03:12:19.2362774Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2362957Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2363154Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2363295Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2363489Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2363583Z self.run() 2022-11-23T03:12:19.2363773Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2363908Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2364390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2364510Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2364845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2364999Z getattr(self, test_name)() 2022-11-23T03:12:19.2365331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2365415Z fn() 2022-11-23T03:12:19.2365753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2365860Z test(self, **param_kwargs) 2022-11-23T03:12:19.2366188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2366297Z return func(*args, **kwargs) 2022-11-23T03:12:19.2366514Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2366611Z self.run_subtests( 2022-11-23T03:12:19.2366931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2367078Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2367419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2367556Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2367903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2368008Z output = model(*input) 2022-11-23T03:12:19.2368307Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2368434Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2368777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2368938Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2369280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2369391Z _lazy_init(state, module) 2022-11-23T03:12:19.2369719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2369845Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2370160Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2370269Z return func(*args, **kwargs) 2022-11-23T03:12:19.2370615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2370702Z p_assert( 2022-11-23T03:12:19.2371012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2371122Z traceback.print_stack() 2022-11-23T03:12:19.2371235Z File "", line 1, in 2022-11-23T03:12:19.2371429Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2371600Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2371781Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2372090Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2372292Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2372384Z self.run() 2022-11-23T03:12:19.2372578Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2372710Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2373035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2373156Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2373494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2373655Z getattr(self, test_name)() 2022-11-23T03:12:19.2374004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2374090Z fn() 2022-11-23T03:12:19.2374439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2374552Z test(self, **param_kwargs) 2022-11-23T03:12:19.2374892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2375161Z return func(*args, **kwargs) 2022-11-23T03:12:19.2375373Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2375471Z self.run_subtests( 2022-11-23T03:12:19.2375797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2375946Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2376284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2376419Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2376768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2377052Z output = model(*input) 2022-11-23T03:12:19.2377364Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2377488Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2377852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2378015Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2378369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2378487Z _lazy_init(state, module) 2022-11-23T03:12:19.2378828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2378960Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2379276Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2379390Z return func(*args, **kwargs) 2022-11-23T03:12:19.2379754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2379844Z p_assert( 2022-11-23T03:12:19.2380167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2380282Z traceback.print_stack() 2022-11-23T03:12:19.2380507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2380783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2381004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2381231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2381350Z File "", line 1, in 2022-11-23T03:12:19.2381548Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2381678Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2381867Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2382007Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2382206Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2382292Z self.run() 2022-11-23T03:12:19.2382483Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2382666Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2382995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2383117Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2383618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2383725Z getattr(self, test_name)() 2022-11-23T03:12:19.2384461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2384553Z fn() 2022-11-23T03:12:19.2384913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2385024Z test(self, **param_kwargs) 2022-11-23T03:12:19.2385362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2385485Z return func(*args, **kwargs) 2022-11-23T03:12:19.2385716Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2385817Z self.run_subtests( 2022-11-23T03:12:19.2386150Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2386302Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2386654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2386794Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2387157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2387264Z output = model(*input) 2022-11-23T03:12:19.2387734Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2387865Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2388211Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2388369Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2388709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2388814Z _lazy_init(state, module) 2022-11-23T03:12:19.2389140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2389317Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2389636Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2389744Z return func(*args, **kwargs) 2022-11-23T03:12:19.2390164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2390263Z p_assert( 2022-11-23T03:12:19.2390754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2390871Z traceback.print_stack() 2022-11-23T03:12:19.2390989Z File "", line 1, in 2022-11-23T03:12:19.2391186Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2391319Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2391510Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2391643Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2391845Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2391937Z self.run() 2022-11-23T03:12:19.2392127Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2392331Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2392658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2392780Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2393126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2393232Z getattr(self, test_name)() 2022-11-23T03:12:19.2393737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2394000Z fn() 2022-11-23T03:12:19.2394352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2394467Z test(self, **param_kwargs) 2022-11-23T03:12:19.2394809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2394930Z return func(*args, **kwargs) 2022-11-23T03:12:19.2395154Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2395256Z self.run_subtests( 2022-11-23T03:12:19.2395595Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2395744Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2396090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2396230Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2396587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2396695Z output = model(*input) 2022-11-23T03:12:19.2397154Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2397280Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2397630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2397962Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2398314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2398423Z _lazy_init(state, module) 2022-11-23T03:12:19.2398761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2398894Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2399218Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2399323Z return func(*args, **kwargs) 2022-11-23T03:12:19.2399741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2399842Z p_assert( 2022-11-23T03:12:19.2400165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2400279Z traceback.print_stack() 2022-11-23T03:12:19.2400397Z File "", line 1, in 2022-11-23T03:12:19.2400595Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2400717Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2401064Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2401198Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2401391Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2401479Z self.run() 2022-11-23T03:12:19.2401662Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2401845Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2402163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2402274Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2402793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2402910Z getattr(self, test_name)() 2022-11-23T03:12:19.2403255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2403339Z fn() 2022-11-23T03:12:19.2403689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2403800Z test(self, **param_kwargs) 2022-11-23T03:12:19.2404142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2404257Z return func(*args, **kwargs) 2022-11-23T03:12:19.2404484Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2404585Z self.run_subtests( 2022-11-23T03:12:19.2404927Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2405078Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2405427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2405568Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2405929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2406034Z output = model(*input) 2022-11-23T03:12:19.2406345Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2406482Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2406847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2407011Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2407365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2407475Z _lazy_init(state, module) 2022-11-23T03:12:19.2407813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2407936Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2408262Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2408377Z return func(*args, **kwargs) 2022-11-23T03:12:19.2408944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2409038Z p_assert( 2022-11-23T03:12:19.2409537Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2409650Z traceback.print_stack() 2022-11-23T03:12:19.2409767Z File "", line 1, in 2022-11-23T03:12:19.2409958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2410088Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2410276Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2410415Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2410616Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2410707Z self.run() 2022-11-23T03:12:19.2410945Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2411078Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2411404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2411525Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2411874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2411985Z getattr(self, test_name)() 2022-11-23T03:12:19.2412329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2412415Z fn() 2022-11-23T03:12:19.2412764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2412869Z test(self, **param_kwargs) 2022-11-23T03:12:19.2413211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2413328Z return func(*args, **kwargs) 2022-11-23T03:12:19.2413555Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2413814Z self.run_subtests( 2022-11-23T03:12:19.2414144Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2414289Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2414626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2414756Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2415104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2415208Z output = model(*input) 2022-11-23T03:12:19.2415688Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2415823Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2416186Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2416349Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2416703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2416805Z _lazy_init(state, module) 2022-11-23T03:12:19.2417142Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2417278Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2417601Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2417714Z return func(*args, **kwargs) 2022-11-23T03:12:19.2418170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2418269Z p_assert( 2022-11-23T03:12:19.2418595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2418702Z traceback.print_stack() 2022-11-23T03:12:19.2418931Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2419154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2419374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2419594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2419714Z File "", line 1, in 2022-11-23T03:12:19.2419912Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2420106Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2420295Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2420435Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2420635Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2420728Z self.run() 2022-11-23T03:12:19.2420918Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2421050Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2421381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2421496Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2421845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2421956Z getattr(self, test_name)() 2022-11-23T03:12:19.2422308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2422395Z fn() 2022-11-23T03:12:19.2422899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2423007Z test(self, **param_kwargs) 2022-11-23T03:12:19.2423336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2423438Z return func(*args, **kwargs) 2022-11-23T03:12:19.2423659Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2423756Z self.run_subtests( 2022-11-23T03:12:19.2424305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2424451Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2424798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2424934Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2425283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2425379Z output = model(*input) 2022-11-23T03:12:19.2425682Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2425805Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2426159Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2426317Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2426657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2426765Z _lazy_init(state, module) 2022-11-23T03:12:19.2427157Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2427286Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2427602Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2427712Z return func(*args, **kwargs) 2022-11-23T03:12:19.2428062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2428150Z p_assert( 2022-11-23T03:12:19.2428641Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2428756Z traceback.print_stack() 2022-11-23T03:12:19.2428873Z File "", line 1, in 2022-11-23T03:12:19.2429064Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2429264Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2429455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2429594Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2429794Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2429886Z self.run() 2022-11-23T03:12:19.2430077Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2430209Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2430527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2430650Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2430998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2431113Z getattr(self, test_name)() 2022-11-23T03:12:19.2431466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2431554Z fn() 2022-11-23T03:12:19.2432066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2432168Z test(self, **param_kwargs) 2022-11-23T03:12:19.2432500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2432609Z return func(*args, **kwargs) 2022-11-23T03:12:19.2432880Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2432980Z self.run_subtests( 2022-11-23T03:12:19.2433307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2433452Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2433797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2433927Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2434456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2434565Z output = model(*input) 2022-11-23T03:12:19.2434878Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2435006Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2435369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2435534Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2435884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2435996Z _lazy_init(state, module) 2022-11-23T03:12:19.2436375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2436514Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2436839Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2436952Z return func(*args, **kwargs) 2022-11-23T03:12:19.2437317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2437572Z p_assert( 2022-11-23T03:12:19.2438060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2438168Z traceback.print_stack() 2022-11-23T03:12:19.2438287Z File "", line 1, in 2022-11-23T03:12:19.2438484Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2438669Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2438858Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2438996Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2439197Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2439288Z self.run() 2022-11-23T03:12:19.2439473Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2439605Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2439932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2440053Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2440398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2440512Z getattr(self, test_name)() 2022-11-23T03:12:19.2441190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2441278Z fn() 2022-11-23T03:12:19.2441626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2441737Z test(self, **param_kwargs) 2022-11-23T03:12:19.2442077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2442190Z return func(*args, **kwargs) 2022-11-23T03:12:19.2442419Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2442521Z self.run_subtests( 2022-11-23T03:12:19.2442861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2443010Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2443362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2443501Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2443864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2443970Z output = model(*input) 2022-11-23T03:12:19.2444437Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2444563Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2444913Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2445071Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2445404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2445513Z _lazy_init(state, module) 2022-11-23T03:12:19.2445892Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2446025Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2446527Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2446639Z return func(*args, **kwargs) 2022-11-23T03:12:19.2447004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2447094Z p_assert( 2022-11-23T03:12:19.2447407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2447521Z traceback.print_stack() 2022-11-23T03:12:19.2447639Z File "", line 1, in 2022-11-23T03:12:19.2447834Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2448015Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2448204Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2448342Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2448533Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2448626Z self.run() 2022-11-23T03:12:19.2448815Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2448948Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2449273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2449395Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2449745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2449861Z getattr(self, test_name)() 2022-11-23T03:12:19.2450203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2450291Z fn() 2022-11-23T03:12:19.2450645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2450759Z test(self, **param_kwargs) 2022-11-23T03:12:19.2451102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2451215Z return func(*args, **kwargs) 2022-11-23T03:12:19.2451441Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2451542Z self.run_subtests( 2022-11-23T03:12:19.2451874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2452024Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2452380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2452522Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2452882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2452989Z output = model(*input) 2022-11-23T03:12:19.2453300Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2453429Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2453784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2453947Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2454298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2454454Z _lazy_init(state, module) 2022-11-23T03:12:19.2454802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2454935Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2455260Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2455374Z return func(*args, **kwargs) 2022-11-23T03:12:19.2455893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2455982Z p_assert( 2022-11-23T03:12:19.2456292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2456401Z traceback.print_stack() 2022-11-23T03:12:19.2456618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2456883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2457096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2457307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2457511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2457718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2457924Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2458130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2458336Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2458543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2458939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2459154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2459900Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2460630Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2461356Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2462089Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2462807Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2463580Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2464535Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2465263Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2465490Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2465815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2466037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2466255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2466470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2466684Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2466898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2467111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2467319Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2467688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2467899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2468103Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2468814Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2469677Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2470405Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2471131Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2471848Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.2472625Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2473357Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2474073Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2474795Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2475561Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2476415Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2477107Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2477996Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2478712Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2479427Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2480144Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2480855Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2481612Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2482336Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2483188Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2483405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2483689Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2483902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2484107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2484316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2484524Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2484732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2484939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2485146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2485352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2485567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2485765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2486660Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2487382Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2488106Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2488823Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2489751Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2490495Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2491199Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2491889Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2492104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2492364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2492575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2492784Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2492991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2493196Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2493404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2493604Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2493810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2494020Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2494411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2494627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2495359Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2496079Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2496809Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.2497678Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2498559Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2499327Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2500051Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2500765Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2501674Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2502364Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2503237Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2504165Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2504896Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2505606Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2506327Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2507040Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2507908Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2508666Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2509365Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2510250Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.2510535Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2510761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2510974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2511192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2511409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2511623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2511838Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2512052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2512264Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2512482Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2512691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2512904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2513637Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2514503Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2515205Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2516071Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2516787Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2517549Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2518271Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.2519149Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2519409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2519620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2519832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2520043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2520250Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2520457Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2520660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2520866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2521072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2521285Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2521489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2521696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2522390Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2523086Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2523960Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2524680Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2525393Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2526153Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.2527034Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2527732Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2528477Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2529168Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2530056Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2530776Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2531486Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2532363Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2533109Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2533806Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2534489Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2535425Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2536144Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2536858Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2537122Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2537343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2537561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2537945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2538324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2538539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2538753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2538969Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2539180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2539394Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2539607Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2539823Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2540548Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2541599Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2542323Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2543040Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2543750Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2544918Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2545622Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2546311Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.2546595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2546808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2547019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2547229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2547431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2547639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2547845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2548055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2548260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2548472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2548679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2548886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2548983Z dist init r=2, world=4 2022-11-23T03:12:19.2549460Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2549768Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2550066Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2550365Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2550658Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2550947Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2551236Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:19.2551524Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2551856Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2552153Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2552442Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2552729Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.2552822Z dist init r=1, world=4 2022-11-23T03:12:19.2553134Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2553438Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2553824Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2554117Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2554408Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:19.2554697Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2554988Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2555283Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2555572Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2555860Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2556303Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2556580Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.2556681Z dist init r=3, world=4 2022-11-23T03:12:19.2556983Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2557274Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2557557Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:19.2557838Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2558118Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2558444Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2558732Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2559011Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2559282Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2559559Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2559841Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2560169Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.2560265Z dist init r=0, world=4 2022-11-23T03:12:19.2560564Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.2560854Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2561322Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2561616Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2561910Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2562198Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2562480Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2562768Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2563054Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2563349Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2563637Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.2563922Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.2564011Z ok (10.732s) 2022-11-23T03:12:19.2564334Z test_transformer_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29182 2022-11-23T03:12:19.2564540Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29183 2022-11-23T03:12:19.2564906Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29184 2022-11-23T03:12:19.2565148Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29185 2022-11-23T03:12:19.2565507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.2565666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.2566138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.2566315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.2566657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.2566816Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.2567168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.2567395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.2567733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.2567890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.2568240Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.2568412Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.2568750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.2568906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.2569256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.2569430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.2569658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.2569877Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.2570100Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.2570320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.2570694Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.2571066Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.2571426Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.2571793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.2572004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.2572211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.2572412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.2572790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.2573018Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2573237Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2573456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2573677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2574727Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.2574836Z warnings.warn( 2022-11-23T03:12:19.2575836Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.2575983Z warnings.warn( 2022-11-23T03:12:19.2576990Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.2577090Z warnings.warn( 2022-11-23T03:12:19.2578095Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.2578192Z warnings.warn( 2022-11-23T03:12:19.2578313Z File "<string>", line 1, in <module> 2022-11-23T03:12:19.2578516Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2578646Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2578838Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2578976Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2579176Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2579269Z self.run() 2022-11-23T03:12:19.2579454Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2579588Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2579917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2580044Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2580394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2580504Z getattr(self, test_name)() 2022-11-23T03:12:19.2580855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2580943Z fn() 2022-11-23T03:12:19.2581292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2581403Z test(self, **param_kwargs) 2022-11-23T03:12:19.2581746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2581858Z return func(*args, **kwargs) 2022-11-23T03:12:19.2582085Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2582236Z self.run_subtests( 2022-11-23T03:12:19.2582590Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2582733Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2583086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2583227Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2583588Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2583696Z output = model(*input) 2022-11-23T03:12:19.2584213Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2584350Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2584801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2584966Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2585317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2585432Z _lazy_init(state, module) 2022-11-23T03:12:19.2585772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2585904Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2586229Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2586341Z return func(*args, **kwargs) 2022-11-23T03:12:19.2586709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2586805Z p_assert( 2022-11-23T03:12:19.2587128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2587243Z traceback.print_stack() 2022-11-23T03:12:19.2587361Z File "", line 1, in 2022-11-23T03:12:19.2587560Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2587690Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2587879Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2588019Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2588215Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2588307Z self.run() 2022-11-23T03:12:19.2588498Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2588631Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2588964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2589090Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2589493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2589605Z getattr(self, test_name)() 2022-11-23T03:12:19.2589946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2590038Z fn() 2022-11-23T03:12:19.2590388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2590499Z test(self, **param_kwargs) 2022-11-23T03:12:19.2590997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2591099Z return func(*args, **kwargs) 2022-11-23T03:12:19.2591318Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2591478Z self.run_subtests( 2022-11-23T03:12:19.2591817Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2591962Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2592300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2592434Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2592783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2592880Z output = model(*input) 2022-11-23T03:12:19.2593181Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2593304Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2593709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2593867Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2594208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2594312Z _lazy_init(state, module) 2022-11-23T03:12:19.2594818Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2594949Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2595265Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2595377Z return func(*args, **kwargs) 2022-11-23T03:12:19.2595740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2595835Z p_assert( 2022-11-23T03:12:19.2596165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2596281Z traceback.print_stack() 2022-11-23T03:12:19.2596398Z File "", line 1, in 2022-11-23T03:12:19.2596590Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2596720Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2596908Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2597048Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2597248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2597340Z self.run() 2022-11-23T03:12:19.2597529Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2597663Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2597989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2598115Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2598465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2598577Z getattr(self, test_name)() 2022-11-23T03:12:19.2598921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2599006Z fn() 2022-11-23T03:12:19.2599355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2599468Z test(self, **param_kwargs) 2022-11-23T03:12:19.2599803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2599915Z return func(*args, **kwargs) 2022-11-23T03:12:19.2600143Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2600293Z self.run_subtests( 2022-11-23T03:12:19.2600643Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2600793Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2601145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2601283Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2601793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2601897Z output = model(*input) 2022-11-23T03:12:19.2602196Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2602319Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2602741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2602900Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2603243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2603528Z _lazy_init(state, module) 2022-11-23T03:12:19.2603861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2603993Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2604317Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2604429Z return func(*args, **kwargs) 2022-11-23T03:12:19.2604795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2604890Z p_assert( 2022-11-23T03:12:19.2605218Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2605335Z traceback.print_stack() 2022-11-23T03:12:19.2605446Z File "", line 1, in 2022-11-23T03:12:19.2605644Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2605774Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2605963Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2606102Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2606301Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2606392Z self.run() 2022-11-23T03:12:19.2606732Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2606864Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2607189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2607307Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2607822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2607934Z getattr(self, test_name)() 2022-11-23T03:12:19.2608278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2608364Z fn() 2022-11-23T03:12:19.2608708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2608819Z test(self, **param_kwargs) 2022-11-23T03:12:19.2609162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2609274Z return func(*args, **kwargs) 2022-11-23T03:12:19.2609504Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2609655Z self.run_subtests( 2022-11-23T03:12:19.2610003Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2610154Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2610497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2610636Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2610996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2611102Z output = model(*input) 2022-11-23T03:12:19.2611414Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2611541Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2611962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2612126Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2612475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2612585Z _lazy_init(state, module) 2022-11-23T03:12:19.2612923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2613054Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2613379Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2613491Z return func(*args, **kwargs) 2022-11-23T03:12:19.2613856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2613953Z p_assert( 2022-11-23T03:12:19.2614274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2614390Z traceback.print_stack() 2022-11-23T03:12:19.2614615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2614839Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2615061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2615281Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2615398Z File "", line 1, in 2022-11-23T03:12:19.2615596Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2615721Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2615909Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2616054Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2616258Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2616349Z self.run() 2022-11-23T03:12:19.2616539Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2616671Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2616997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2617120Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2617470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2617579Z getattr(self, test_name)() 2022-11-23T03:12:19.2617926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2618016Z fn() 2022-11-23T03:12:19.2618410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2618528Z test(self, **param_kwargs) 2022-11-23T03:12:19.2618867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2618979Z return func(*args, **kwargs) 2022-11-23T03:12:19.2619205Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2619305Z self.run_subtests( 2022-11-23T03:12:19.2619642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2619793Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2620298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2620482Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2620830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2620935Z output = model(*input) 2022-11-23T03:12:19.2621238Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2621364Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2621714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2621872Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2622212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2622317Z _lazy_init(state, module) 2022-11-23T03:12:19.2622636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2622772Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2623088Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2623196Z return func(*args, **kwargs) 2022-11-23T03:12:19.2623548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2623636Z p_assert( 2022-11-23T03:12:19.2624297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2624593Z traceback.print_stack() 2022-11-23T03:12:19.2624708Z File "", line 1, in 2022-11-23T03:12:19.2624906Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2625037Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2625228Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2625374Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2625575Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2625667Z self.run() 2022-11-23T03:12:19.2625851Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2625985Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2626318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2626438Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2626787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2626898Z getattr(self, test_name)() 2022-11-23T03:12:19.2627242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2627332Z fn() 2022-11-23T03:12:19.2627750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2627871Z test(self, **param_kwargs) 2022-11-23T03:12:19.2628219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2628329Z return func(*args, **kwargs) 2022-11-23T03:12:19.2628556Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2628657Z self.run_subtests( 2022-11-23T03:12:19.2628996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2629313Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2629826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2630034Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2630405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2630515Z output = model(*input) 2022-11-23T03:12:19.2630830Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2630961Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2631325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2631491Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2631840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2631955Z _lazy_init(state, module) 2022-11-23T03:12:19.2632295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2632435Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2632968Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2633081Z return func(*args, **kwargs) 2022-11-23T03:12:19.2633441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2633531Z p_assert( 2022-11-23T03:12:19.2633838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2633954Z traceback.print_stack() 2022-11-23T03:12:19.2634072Z File "", line 1, in 2022-11-23T03:12:19.2634264Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2634395Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2634578Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2634722Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2634918Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2635179Z self.run() 2022-11-23T03:12:19.2635375Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2635512Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2635844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2635966Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2636316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2636432Z getattr(self, test_name)() 2022-11-23T03:12:19.2636779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2636867Z fn() 2022-11-23T03:12:19.2637280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2637401Z test(self, **param_kwargs) 2022-11-23T03:12:19.2637750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2637864Z return func(*args, **kwargs) 2022-11-23T03:12:19.2638250Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2638350Z self.run_subtests( 2022-11-23T03:12:19.2638849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2639004Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2639354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2639548Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2639918Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2640030Z output = model(*input) 2022-11-23T03:12:19.2640343Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2640477Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2640843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2641003Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2641359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2641630Z _lazy_init(state, module) 2022-11-23T03:12:19.2642146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2642287Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2642617Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2642732Z return func(*args, **kwargs) 2022-11-23T03:12:19.2643101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2643188Z p_assert( 2022-11-23T03:12:19.2643586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2643705Z traceback.print_stack() 2022-11-23T03:12:19.2643825Z File "", line 1, in 2022-11-23T03:12:19.2644024Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2644156Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2644349Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2644492Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2644694Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2644789Z self.run() 2022-11-23T03:12:19.2644983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2645118Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2645596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2645715Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2646054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2646159Z getattr(self, test_name)() 2022-11-23T03:12:19.2646490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2646581Z fn() 2022-11-23T03:12:19.2646975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2647092Z test(self, **param_kwargs) 2022-11-23T03:12:19.2647429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2647539Z return func(*args, **kwargs) 2022-11-23T03:12:19.2647761Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2647855Z self.run_subtests( 2022-11-23T03:12:19.2648187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2648334Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2648672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2649030Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2649401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2649512Z output = model(*input) 2022-11-23T03:12:19.2649826Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2649951Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2650313Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2650477Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2650833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2650945Z _lazy_init(state, module) 2022-11-23T03:12:19.2651284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2651427Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2651756Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2651865Z return func(*args, **kwargs) 2022-11-23T03:12:19.2652231Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2652324Z p_assert( 2022-11-23T03:12:19.2652650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2652766Z traceback.print_stack() 2022-11-23T03:12:19.2652993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2653219Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2653441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2653665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2653787Z File "", line 1, in 2022-11-23T03:12:19.2653989Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2654124Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2654317Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2654460Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2654663Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2654751Z self.run() 2022-11-23T03:12:19.2654944Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2655082Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2655414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2655587Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2655949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2656063Z getattr(self, test_name)() 2022-11-23T03:12:19.2656413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2656649Z fn() 2022-11-23T03:12:19.2656992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2657104Z test(self, **param_kwargs) 2022-11-23T03:12:19.2657437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2657548Z return func(*args, **kwargs) 2022-11-23T03:12:19.2657770Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2657922Z self.run_subtests( 2022-11-23T03:12:19.2658256Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2658399Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2658742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2658880Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2659231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2659337Z output = model(*input) 2022-11-23T03:12:19.2659640Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2659767Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2660121Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2660280Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2660623Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2660736Z _lazy_init(state, module) 2022-11-23T03:12:19.2661064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2661194Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2661509Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2661619Z return func(*args, **kwargs) 2022-11-23T03:12:19.2662152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2662240Z p_assert( 2022-11-23T03:12:19.2662571Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2662687Z traceback.print_stack() 2022-11-23T03:12:19.2662807Z File "", line 1, in 2022-11-23T03:12:19.2663009Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2663146Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2663339Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2663481Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2663678Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2663775Z self.run() 2022-11-23T03:12:19.2664206Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2664350Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2664686Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2664883Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2665251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2665358Z getattr(self, test_name)() 2022-11-23T03:12:19.2665710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2665798Z fn() 2022-11-23T03:12:19.2666150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2666264Z test(self, **param_kwargs) 2022-11-23T03:12:19.2666606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2666722Z return func(*args, **kwargs) 2022-11-23T03:12:19.2666951Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2667131Z self.run_subtests( 2022-11-23T03:12:19.2667478Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2667632Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2667984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2668128Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2668492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2668602Z output = model(*input) 2022-11-23T03:12:19.2669078Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2669203Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2669563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2669727Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2670073Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2670181Z _lazy_init(state, module) 2022-11-23T03:12:19.2670513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2670643Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2670959Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2671065Z return func(*args, **kwargs) 2022-11-23T03:12:19.2671421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2671515Z p_assert( 2022-11-23T03:12:19.2671831Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2671946Z traceback.print_stack() 2022-11-23T03:12:19.2672063Z File "", line 1, in 2022-11-23T03:12:19.2672259Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2672387Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2672568Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2672703Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2672898Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2673166Z self.run() 2022-11-23T03:12:19.2673357Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2673493Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2673820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2673992Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2674347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2674465Z getattr(self, test_name)() 2022-11-23T03:12:19.2674815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2674902Z fn() 2022-11-23T03:12:19.2675253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2675370Z test(self, **param_kwargs) 2022-11-23T03:12:19.2675713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2675822Z return func(*args, **kwargs) 2022-11-23T03:12:19.2676051Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2676361Z self.run_subtests( 2022-11-23T03:12:19.2676692Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2676839Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2677178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2677317Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2677670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2677770Z output = model(*input) 2022-11-23T03:12:19.2678253Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2678387Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2678759Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2678928Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2679284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2679399Z _lazy_init(state, module) 2022-11-23T03:12:19.2679738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2679873Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2680194Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2680312Z return func(*args, **kwargs) 2022-11-23T03:12:19.2680682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2680779Z p_assert( 2022-11-23T03:12:19.2681109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2681228Z traceback.print_stack() 2022-11-23T03:12:19.2681348Z File "", line 1, in 2022-11-23T03:12:19.2681544Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2681676Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2681869Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2682011Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2682215Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2682309Z self.run() 2022-11-23T03:12:19.2682504Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2682641Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2682968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2683137Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2683660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2683771Z getattr(self, test_name)() 2022-11-23T03:12:19.2684105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2684190Z fn() 2022-11-23T03:12:19.2684707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2684822Z test(self, **param_kwargs) 2022-11-23T03:12:19.2685158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2685274Z return func(*args, **kwargs) 2022-11-23T03:12:19.2685503Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2685659Z self.run_subtests( 2022-11-23T03:12:19.2686001Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2686154Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2686504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2686648Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2687006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2687118Z output = model(*input) 2022-11-23T03:12:19.2687583Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2687713Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2688074Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2688236Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2688576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2688684Z _lazy_init(state, module) 2022-11-23T03:12:19.2689003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2689134Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2689503Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2689615Z return func(*args, **kwargs) 2022-11-23T03:12:19.2689972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2690065Z p_assert( 2022-11-23T03:12:19.2690383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2690496Z traceback.print_stack() 2022-11-23T03:12:19.2690710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2690927Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2691325Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2691548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2691671Z File "", line 1, in 2022-11-23T03:12:19.2691871Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2692006Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2692192Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2692338Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2692625Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2692727Z self.run() 2022-11-23T03:12:19.2692920Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2693057Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2693390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2693514Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2693859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2694137Z getattr(self, test_name)() 2022-11-23T03:12:19.2694474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2694605Z fn() 2022-11-23T03:12:19.2695132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2695246Z test(self, **param_kwargs) 2022-11-23T03:12:19.2695591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2695709Z return func(*args, **kwargs) 2022-11-23T03:12:19.2695931Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2696035Z self.run_subtests( 2022-11-23T03:12:19.2696374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2696528Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2696878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2697025Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2697389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2697501Z output = model(*input) 2022-11-23T03:12:19.2697811Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2697942Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2698460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2698621Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2699147Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2699258Z _lazy_init(state, module) 2022-11-23T03:12:19.2699598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2699737Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2700061Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2700178Z return func(*args, **kwargs) 2022-11-23T03:12:19.2700548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2700640Z p_assert( 2022-11-23T03:12:19.2700964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2701082Z traceback.print_stack() 2022-11-23T03:12:19.2701203Z File "", line 1, in 2022-11-23T03:12:19.2701402Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2701529Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2701721Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2701865Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2702269Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2702368Z self.run() 2022-11-23T03:12:19.2702558Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2702692Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2703004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2703126Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2703462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2703572Z getattr(self, test_name)() 2022-11-23T03:12:19.2704339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2704499Z fn() 2022-11-23T03:12:19.2704866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2704981Z test(self, **param_kwargs) 2022-11-23T03:12:19.2705324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2705439Z return func(*args, **kwargs) 2022-11-23T03:12:19.2705671Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2705774Z self.run_subtests( 2022-11-23T03:12:19.2706115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2706267Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2706617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2706762Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2707124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2707234Z output = model(*input) 2022-11-23T03:12:19.2707550Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2707681Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2708048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2708215Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2708568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2708681Z _lazy_init(state, module) 2022-11-23T03:12:19.2709173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2709310Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2709631Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2709744Z return func(*args, **kwargs) 2022-11-23T03:12:19.2710101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2710192Z p_assert( 2022-11-23T03:12:19.2710922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2711055Z traceback.print_stack() 2022-11-23T03:12:19.2711172Z File "", line 1, in 2022-11-23T03:12:19.2711372Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2711503Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2711697Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2711909Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2712122Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2712215Z self.run() 2022-11-23T03:12:19.2712409Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2712539Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2712872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2712996Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2713344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2713457Z getattr(self, test_name)() 2022-11-23T03:12:19.2713806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2713942Z fn() 2022-11-23T03:12:19.2714294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2714411Z test(self, **param_kwargs) 2022-11-23T03:12:19.2714915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2715030Z return func(*args, **kwargs) 2022-11-23T03:12:19.2715251Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2715352Z self.run_subtests( 2022-11-23T03:12:19.2715681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2715829Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2716164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2716307Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2716841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2716954Z output = model(*input) 2022-11-23T03:12:19.2717271Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2717402Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2717766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2717932Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2718287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2718392Z _lazy_init(state, module) 2022-11-23T03:12:19.2718734Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2718875Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2719207Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2719321Z return func(*args, **kwargs) 2022-11-23T03:12:19.2719846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2719938Z p_assert( 2022-11-23T03:12:19.2720245Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2720361Z traceback.print_stack() 2022-11-23T03:12:19.2720475Z File "", line 1, in 2022-11-23T03:12:19.2720669Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2720796Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2720981Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2721168Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2721371Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2721456Z self.run() 2022-11-23T03:12:19.2721643Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2721775Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2722094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2722213Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2722549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2722658Z getattr(self, test_name)() 2022-11-23T03:12:19.2722992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2723117Z fn() 2022-11-23T03:12:19.2723465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2723573Z test(self, **param_kwargs) 2022-11-23T03:12:19.2723907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2724021Z return func(*args, **kwargs) 2022-11-23T03:12:19.2724242Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2724342Z self.run_subtests( 2022-11-23T03:12:19.2724665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2724816Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2725160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2725302Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2725656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2725763Z output = model(*input) 2022-11-23T03:12:19.2726066Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2726198Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2726549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2726704Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2727044Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2727153Z _lazy_init(state, module) 2022-11-23T03:12:19.2727482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2727618Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2727936Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2728047Z return func(*args, **kwargs) 2022-11-23T03:12:19.2728401Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2728485Z p_assert( 2022-11-23T03:12:19.2728800Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2728913Z traceback.print_stack() 2022-11-23T03:12:19.2729134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2729351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2729564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2729826Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2729948Z File "", line 1, in 2022-11-23T03:12:19.2730322Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2730458Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2730651Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2730792Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2730995Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2731090Z self.run() 2022-11-23T03:12:19.2731282Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2731414Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2731747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2731935Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2732288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2732403Z getattr(self, test_name)() 2022-11-23T03:12:19.2732751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2732841Z fn() 2022-11-23T03:12:19.2733390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2733494Z test(self, **param_kwargs) 2022-11-23T03:12:19.2733827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2733939Z return func(*args, **kwargs) 2022-11-23T03:12:19.2734158Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2734263Z self.run_subtests( 2022-11-23T03:12:19.2734594Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2734743Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2735080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2735212Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2735748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2735859Z output = model(*input) 2022-11-23T03:12:19.2736173Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2736305Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2736671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2736849Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2737203Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2737309Z _lazy_init(state, module) 2022-11-23T03:12:19.2737649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2737784Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2738112Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2738228Z return func(*args, **kwargs) 2022-11-23T03:12:19.2738756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2739014Z p_assert( 2022-11-23T03:12:19.2739394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2739509Z traceback.print_stack() 2022-11-23T03:12:19.2739630Z File "", line 1, in 2022-11-23T03:12:19.2739831Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2739963Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2740154Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2740297Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2740499Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2740587Z self.run() 2022-11-23T03:12:19.2740782Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2740919Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2741249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2741426Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2741780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2741895Z getattr(self, test_name)() 2022-11-23T03:12:19.2742244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2742326Z fn() 2022-11-23T03:12:19.2742679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2742797Z test(self, **param_kwargs) 2022-11-23T03:12:19.2743143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2743259Z return func(*args, **kwargs) 2022-11-23T03:12:19.2743487Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2743598Z self.run_subtests( 2022-11-23T03:12:19.2744173Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2744332Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2744691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2744836Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2745196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2745306Z output = model(*input) 2022-11-23T03:12:19.2745620Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2745751Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2746116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2746282Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2746644Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2746754Z _lazy_init(state, module) 2022-11-23T03:12:19.2747097Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2747231Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2747557Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2747670Z return func(*args, **kwargs) 2022-11-23T03:12:19.2748037Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2748123Z p_assert( 2022-11-23T03:12:19.2748522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2748647Z traceback.print_stack() 2022-11-23T03:12:19.2748768Z File "", line 1, in 2022-11-23T03:12:19.2748966Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2749100Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2749294Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2749593Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2749783Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2749875Z self.run() 2022-11-23T03:12:19.2750236Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2750372Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2750701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2750888Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2751241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2751349Z getattr(self, test_name)() 2022-11-23T03:12:19.2751699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2751787Z fn() 2022-11-23T03:12:19.2752137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2752250Z test(self, **param_kwargs) 2022-11-23T03:12:19.2752591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2752706Z return func(*args, **kwargs) 2022-11-23T03:12:19.2752936Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2753041Z self.run_subtests( 2022-11-23T03:12:19.2753384Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2753537Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2753889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2754032Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2754396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2754507Z output = model(*input) 2022-11-23T03:12:19.2754824Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2754949Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2755322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2755491Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2755846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2755958Z _lazy_init(state, module) 2022-11-23T03:12:19.2756297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2756432Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2756919Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2757025Z return func(*args, **kwargs) 2022-11-23T03:12:19.2757380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2757471Z p_assert( 2022-11-23T03:12:19.2757837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2757958Z traceback.print_stack() 2022-11-23T03:12:19.2758074Z File "", line 1, in 2022-11-23T03:12:19.2758268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2758397Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2758575Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2758714Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2758908Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2759000Z self.run() 2022-11-23T03:12:19.2759371Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2759510Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2759844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2760019Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2760365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2760478Z getattr(self, test_name)() 2022-11-23T03:12:19.2760825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2760913Z fn() 2022-11-23T03:12:19.2761267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2761380Z test(self, **param_kwargs) 2022-11-23T03:12:19.2761725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2761834Z return func(*args, **kwargs) 2022-11-23T03:12:19.2762399Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2762511Z self.run_subtests( 2022-11-23T03:12:19.2762858Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2763012Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2763364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2763505Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2763869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2763978Z output = model(*input) 2022-11-23T03:12:19.2764287Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2764421Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2764796Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2764963Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2765479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2765589Z _lazy_init(state, module) 2022-11-23T03:12:19.2765918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2766046Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2766356Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2766469Z return func(*args, **kwargs) 2022-11-23T03:12:19.2766822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2766916Z p_assert( 2022-11-23T03:12:19.2767275Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2767397Z traceback.print_stack() 2022-11-23T03:12:19.2767615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2767827Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2768046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2768259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2768377Z File "", line 1, in 2022-11-23T03:12:19.2768571Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2768699Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2768886Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2769073Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2769266Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2769360Z self.run() 2022-11-23T03:12:19.2769545Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2769678Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2770176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2770301Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2770654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2770769Z getattr(self, test_name)() 2022-11-23T03:12:19.2771111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2771200Z fn() 2022-11-23T03:12:19.2771567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2771682Z test(self, **param_kwargs) 2022-11-23T03:12:19.2772027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2772142Z return func(*args, **kwargs) 2022-11-23T03:12:19.2772371Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2772474Z self.run_subtests( 2022-11-23T03:12:19.2772811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2773288Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2773643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2773787Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2774159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2774270Z output = model(*input) 2022-11-23T03:12:19.2774586Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2774718Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2775076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2775241Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2775599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2775711Z _lazy_init(state, module) 2022-11-23T03:12:19.2776050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2776189Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2776718Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2776836Z return func(*args, **kwargs) 2022-11-23T03:12:19.2777185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2777274Z p_assert( 2022-11-23T03:12:19.2777587Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2777698Z traceback.print_stack() 2022-11-23T03:12:19.2777814Z File "", line 1, in 2022-11-23T03:12:19.2778008Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2778136Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2778500Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2778686Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2778895Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2778989Z self.run() 2022-11-23T03:12:19.2779180Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2779317Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2779649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2779773Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2780117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2780233Z getattr(self, test_name)() 2022-11-23T03:12:19.2780582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2780675Z fn() 2022-11-23T03:12:19.2781031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2781146Z test(self, **param_kwargs) 2022-11-23T03:12:19.2781490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2781605Z return func(*args, **kwargs) 2022-11-23T03:12:19.2781827Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2781934Z self.run_subtests( 2022-11-23T03:12:19.2782277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2782429Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2782781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2782928Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2783298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2783408Z output = model(*input) 2022-11-23T03:12:19.2784254Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2784397Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2784770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2784936Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2785291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2785403Z _lazy_init(state, module) 2022-11-23T03:12:19.2785743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2785879Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2786267Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2786396Z return func(*args, **kwargs) 2022-11-23T03:12:19.2786771Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2786865Z p_assert( 2022-11-23T03:12:19.2787192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2787308Z traceback.print_stack() 2022-11-23T03:12:19.2787428Z File "", line 1, in 2022-11-23T03:12:19.2787629Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2787756Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2787948Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2788152Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2788358Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2788453Z self.run() 2022-11-23T03:12:19.2788647Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2788785Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2789110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2789233Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2789635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2789751Z getattr(self, test_name)() 2022-11-23T03:12:19.2790258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2790347Z fn() 2022-11-23T03:12:19.2790690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2790804Z test(self, **param_kwargs) 2022-11-23T03:12:19.2791131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2791243Z return func(*args, **kwargs) 2022-11-23T03:12:19.2791463Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2791563Z self.run_subtests( 2022-11-23T03:12:19.2791890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2792038Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2792375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2792519Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2792866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2792974Z output = model(*input) 2022-11-23T03:12:19.2793278Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2793404Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2793755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2793915Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2794257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2794365Z _lazy_init(state, module) 2022-11-23T03:12:19.2794688Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2794872Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2795202Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2795491Z return func(*args, **kwargs) 2022-11-23T03:12:19.2795864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2795957Z p_assert( 2022-11-23T03:12:19.2796282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2796400Z traceback.print_stack() 2022-11-23T03:12:19.2796513Z File "", line 1, in 2022-11-23T03:12:19.2796714Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2796847Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2797039Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2797244Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2797452Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2797549Z self.run() 2022-11-23T03:12:19.2797744Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2797873Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2798204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2798329Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2798829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2798943Z getattr(self, test_name)() 2022-11-23T03:12:19.2799451Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2799545Z fn() 2022-11-23T03:12:19.2799897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2800014Z test(self, **param_kwargs) 2022-11-23T03:12:19.2800364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2800480Z return func(*args, **kwargs) 2022-11-23T03:12:19.2800710Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2800814Z self.run_subtests( 2022-11-23T03:12:19.2801156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2801309Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2801652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2801799Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2802164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2802274Z output = model(*input) 2022-11-23T03:12:19.2802742Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2802869Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2803221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2803382Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2803725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2803827Z _lazy_init(state, module) 2022-11-23T03:12:19.2804339Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2804525Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2804864Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2804979Z return func(*args, **kwargs) 2022-11-23T03:12:19.2805346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2805438Z p_assert( 2022-11-23T03:12:19.2805755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2805872Z traceback.print_stack() 2022-11-23T03:12:19.2806100Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2806326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2806547Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2806820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2806941Z File "", line 1, in 2022-11-23T03:12:19.2807299Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2807424Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2807610Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2807747Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2807942Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2808033Z self.run() 2022-11-23T03:12:19.2808219Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2808349Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2808674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2808793Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2809134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2809247Z getattr(self, test_name)() 2022-11-23T03:12:19.2809586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2809672Z fn() 2022-11-23T03:12:19.2810015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2810124Z test(self, **param_kwargs) 2022-11-23T03:12:19.2810456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2810562Z return func(*args, **kwargs) 2022-11-23T03:12:19.2810965Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2811073Z self.run_subtests( 2022-11-23T03:12:19.2811419Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2811572Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2811924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2812067Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2812432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2812539Z output = model(*input) 2022-11-23T03:12:19.2812852Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2812986Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2813351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2813567Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2813934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2814047Z _lazy_init(state, module) 2022-11-23T03:12:19.2814386Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2814513Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2814837Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2814953Z return func(*args, **kwargs) 2022-11-23T03:12:19.2815472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2815564Z p_assert( 2022-11-23T03:12:19.2815940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2816055Z traceback.print_stack() 2022-11-23T03:12:19.2816171Z File "", line 1, in 2022-11-23T03:12:19.2816359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2816490Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2816676Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2816814Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2817185Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2817280Z self.run() 2022-11-23T03:12:19.2817474Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2817605Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2817936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2818066Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2818419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2818534Z getattr(self, test_name)() 2022-11-23T03:12:19.2818880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2818967Z fn() 2022-11-23T03:12:19.2819319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2819426Z test(self, **param_kwargs) 2022-11-23T03:12:19.2819770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2820048Z return func(*args, **kwargs) 2022-11-23T03:12:19.2820273Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2820377Z self.run_subtests( 2022-11-23T03:12:19.2820714Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2820864Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2821204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2821337Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2821689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2821795Z output = model(*input) 2022-11-23T03:12:19.2822098Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2822225Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2822579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2822792Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2823149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2823251Z _lazy_init(state, module) 2022-11-23T03:12:19.2823579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2823709Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2824423Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2824545Z return func(*args, **kwargs) 2022-11-23T03:12:19.2824916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2825012Z p_assert( 2022-11-23T03:12:19.2825421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2825533Z traceback.print_stack() 2022-11-23T03:12:19.2825653Z File "", line 1, in 2022-11-23T03:12:19.2825853Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2825986Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2826177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2826319Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2826522Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2826610Z self.run() 2022-11-23T03:12:19.2826807Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2826944Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2827270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2827402Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2827751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2827864Z getattr(self, test_name)() 2022-11-23T03:12:19.2828365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2828447Z fn() 2022-11-23T03:12:19.2828787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2828896Z test(self, **param_kwargs) 2022-11-23T03:12:19.2829231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2829346Z return func(*args, **kwargs) 2022-11-23T03:12:19.2829566Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2829672Z self.run_subtests( 2022-11-23T03:12:19.2830008Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2830150Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2830675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2830818Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2831187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2831296Z output = model(*input) 2022-11-23T03:12:19.2831611Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2831742Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2832106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2832330Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2832698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2832815Z _lazy_init(state, module) 2022-11-23T03:12:19.2833203Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2833500Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2833819Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2833928Z return func(*args, **kwargs) 2022-11-23T03:12:19.2834281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2834364Z p_assert( 2022-11-23T03:12:19.2834770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2834884Z traceback.print_stack() 2022-11-23T03:12:19.2835000Z File "", line 1, in 2022-11-23T03:12:19.2835194Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2835325Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2835509Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2835645Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2836016Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2836113Z self.run() 2022-11-23T03:12:19.2836308Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2836445Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2836773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2836905Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2837257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2837365Z getattr(self, test_name)() 2022-11-23T03:12:19.2837713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2837801Z fn() 2022-11-23T03:12:19.2838156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2838268Z test(self, **param_kwargs) 2022-11-23T03:12:19.2838610Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2838726Z return func(*args, **kwargs) 2022-11-23T03:12:19.2839117Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2839214Z self.run_subtests( 2022-11-23T03:12:19.2839731Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2839884Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2840239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2840382Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2840745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2840856Z output = model(*input) 2022-11-23T03:12:19.2841171Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2841297Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2841665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2841882Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2842251Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2842363Z _lazy_init(state, module) 2022-11-23T03:12:19.2842706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2842841Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2843168Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2843283Z return func(*args, **kwargs) 2022-11-23T03:12:19.2843645Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2843739Z p_assert( 2022-11-23T03:12:19.2844122Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2844238Z traceback.print_stack() 2022-11-23T03:12:19.2844465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2844691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2844914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2845130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2845252Z File "", line 1, in 2022-11-23T03:12:19.2845455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2845587Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2845939Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2846081Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2846280Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2846371Z self.run() 2022-11-23T03:12:19.2846551Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2846682Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2847000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2847120Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2847460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2847571Z getattr(self, test_name)() 2022-11-23T03:12:19.2847909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2847995Z fn() 2022-11-23T03:12:19.2848337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2848449Z test(self, **param_kwargs) 2022-11-23T03:12:19.2848783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2848895Z return func(*args, **kwargs) 2022-11-23T03:12:19.2849116Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2849216Z self.run_subtests( 2022-11-23T03:12:19.2849543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2849690Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2850023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2850162Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2850742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2850861Z output = model(*input) 2022-11-23T03:12:19.2851178Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2851309Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2851673Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2851841Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2852189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2852299Z _lazy_init(state, module) 2022-11-23T03:12:19.2852637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2852821Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2853153Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2853270Z return func(*args, **kwargs) 2022-11-23T03:12:19.2853639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2853744Z p_assert( 2022-11-23T03:12:19.2854063Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2854178Z traceback.print_stack() 2022-11-23T03:12:19.2854299Z File "", line 1, in 2022-11-23T03:12:19.2854502Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2854636Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2854828Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2854974Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2855173Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2855269Z self.run() 2022-11-23T03:12:19.2855461Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2855598Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2855928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2856049Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2856395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2856507Z getattr(self, test_name)() 2022-11-23T03:12:19.2856847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2856935Z fn() 2022-11-23T03:12:19.2857452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2857562Z test(self, **param_kwargs) 2022-11-23T03:12:19.2857895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2858006Z return func(*args, **kwargs) 2022-11-23T03:12:19.2858227Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2858329Z self.run_subtests( 2022-11-23T03:12:19.2858652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2858800Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2859139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2859275Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2859678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2859792Z output = model(*input) 2022-11-23T03:12:19.2860098Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2860225Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2860570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2860733Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2861075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2861181Z _lazy_init(state, module) 2022-11-23T03:12:19.2861692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2861889Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2862219Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2862337Z return func(*args, **kwargs) 2022-11-23T03:12:19.2862699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2862795Z p_assert( 2022-11-23T03:12:19.2863119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2863233Z traceback.print_stack() 2022-11-23T03:12:19.2863352Z File "", line 1, in 2022-11-23T03:12:19.2863552Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2863684Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2864093Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2864247Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2864455Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2864551Z self.run() 2022-11-23T03:12:19.2864743Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2864879Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2865214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2865338Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2865681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2865798Z getattr(self, test_name)() 2022-11-23T03:12:19.2866251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2866344Z fn() 2022-11-23T03:12:19.2866706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2866821Z test(self, **param_kwargs) 2022-11-23T03:12:19.2867164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2867280Z return func(*args, **kwargs) 2022-11-23T03:12:19.2867504Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2867608Z self.run_subtests( 2022-11-23T03:12:19.2868109Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2868257Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2868596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2868734Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2869158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2869275Z output = model(*input) 2022-11-23T03:12:19.2869574Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2869702Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2870053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2870212Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2870556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2870666Z _lazy_init(state, module) 2022-11-23T03:12:19.2870993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2871184Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2871499Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2871612Z return func(*args, **kwargs) 2022-11-23T03:12:19.2871970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2872061Z p_assert( 2022-11-23T03:12:19.2872373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2872485Z traceback.print_stack() 2022-11-23T03:12:19.2872600Z File "", line 1, in 2022-11-23T03:12:19.2872793Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2872914Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2873101Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2873243Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2873442Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2873534Z self.run() 2022-11-23T03:12:19.2873899Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2874035Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2874359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2874483Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2874834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2874949Z getattr(self, test_name)() 2022-11-23T03:12:19.2875296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2875389Z fn() 2022-11-23T03:12:19.2875748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2875862Z test(self, **param_kwargs) 2022-11-23T03:12:19.2876201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2876317Z return func(*args, **kwargs) 2022-11-23T03:12:19.2876544Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2876648Z self.run_subtests( 2022-11-23T03:12:19.2876992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2877147Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2877502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2877649Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2878054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2878176Z output = model(*input) 2022-11-23T03:12:19.2878492Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2878624Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2878989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2879154Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2879509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2879621Z _lazy_init(state, module) 2022-11-23T03:12:19.2879951Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2880130Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2880464Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2880580Z return func(*args, **kwargs) 2022-11-23T03:12:19.2880948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2881041Z p_assert( 2022-11-23T03:12:19.2881369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2881488Z traceback.print_stack() 2022-11-23T03:12:19.2881709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2881936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2882159Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2882388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2882511Z File "", line 1, in 2022-11-23T03:12:19.2882711Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2882846Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2883038Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2883173Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2883378Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2883474Z self.run() 2022-11-23T03:12:19.2883667Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2883803Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2884136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2884264Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2884617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2884727Z getattr(self, test_name)() 2022-11-23T03:12:19.2885073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2885161Z fn() 2022-11-23T03:12:19.2885514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2885628Z test(self, **param_kwargs) 2022-11-23T03:12:19.2886131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2886244Z return func(*args, **kwargs) 2022-11-23T03:12:19.2886464Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2886563Z self.run_subtests( 2022-11-23T03:12:19.2886935Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2887092Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2887434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2887573Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2887924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2888030Z output = model(*input) 2022-11-23T03:12:19.2888333Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2888456Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2888809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2889022Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2889366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2889526Z _lazy_init(state, module) 2022-11-23T03:12:19.2889858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2889989Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2890303Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2890407Z return func(*args, **kwargs) 2022-11-23T03:12:19.2890760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2890851Z p_assert( 2022-11-23T03:12:19.2891166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2891287Z traceback.print_stack() 2022-11-23T03:12:19.2891404Z File "", line 1, in 2022-11-23T03:12:19.2891599Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2891725Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2891910Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2892047Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2892242Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2892334Z self.run() 2022-11-23T03:12:19.2892521Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2892651Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2892970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2893088Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2893427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2893540Z getattr(self, test_name)() 2022-11-23T03:12:19.2893874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2893963Z fn() 2022-11-23T03:12:19.2894305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2894417Z test(self, **param_kwargs) 2022-11-23T03:12:19.2894753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2894858Z return func(*args, **kwargs) 2022-11-23T03:12:19.2895077Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2895183Z self.run_subtests( 2022-11-23T03:12:19.2895557Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2895890Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2896245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2896387Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2896751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2896854Z output = model(*input) 2022-11-23T03:12:19.2897168Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2897300Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2897667Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2897884Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2898239Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2898351Z _lazy_init(state, module) 2022-11-23T03:12:19.2898691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2898817Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2899140Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2899257Z return func(*args, **kwargs) 2022-11-23T03:12:19.2899952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2900045Z p_assert( 2022-11-23T03:12:19.2900373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2900496Z traceback.print_stack() 2022-11-23T03:12:19.2900617Z File "", line 1, in 2022-11-23T03:12:19.2900810Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2900945Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2901135Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2901276Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2901479Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2901573Z self.run() 2022-11-23T03:12:19.2901768Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2901901Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2902228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2902358Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2902713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2902984Z getattr(self, test_name)() 2022-11-23T03:12:19.2903321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2903406Z fn() 2022-11-23T03:12:19.2903744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2904726Z test(self, **param_kwargs) 2022-11-23T03:12:19.2905097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2905213Z return func(*args, **kwargs) 2022-11-23T03:12:19.2905440Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2905550Z self.run_subtests( 2022-11-23T03:12:19.2905961Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2906125Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2906480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2906618Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2906982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2907092Z output = model(*input) 2022-11-23T03:12:19.2907407Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2907537Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2907902Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2908135Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2908494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2908602Z _lazy_init(state, module) 2022-11-23T03:12:19.2908944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2909079Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2909408Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2909524Z return func(*args, **kwargs) 2022-11-23T03:12:19.2909890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2909983Z p_assert( 2022-11-23T03:12:19.2910312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2910427Z traceback.print_stack() 2022-11-23T03:12:19.2910548Z File "", line 1, in 2022-11-23T03:12:19.2910750Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2910882Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2911073Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2911212Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2911413Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2911508Z self.run() 2022-11-23T03:12:19.2911696Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2911830Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2912157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2912287Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2912637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2912750Z getattr(self, test_name)() 2022-11-23T03:12:19.2913099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2913179Z fn() 2022-11-23T03:12:19.2913531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2913643Z test(self, **param_kwargs) 2022-11-23T03:12:19.2913987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2914102Z return func(*args, **kwargs) 2022-11-23T03:12:19.2914331Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2914441Z self.run_subtests( 2022-11-23T03:12:19.2914826Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2914978Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2915335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2915476Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2915839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2915948Z output = model(*input) 2022-11-23T03:12:19.2916262Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2916393Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2916754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2916964Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2917658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2917774Z _lazy_init(state, module) 2022-11-23T03:12:19.2918114Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2918247Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2918570Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2918685Z return func(*args, **kwargs) 2022-11-23T03:12:19.2919052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2919140Z p_assert( 2022-11-23T03:12:19.2919472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2919591Z traceback.print_stack() 2022-11-23T03:12:19.2919820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2920046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2920268Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2920652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2920771Z File "", line 1, in 2022-11-23T03:12:19.2920959Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2921087Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2921272Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2921412Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2921609Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2921700Z self.run() 2022-11-23T03:12:19.2921885Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2922016Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2922330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2922451Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2922790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2922902Z getattr(self, test_name)() 2022-11-23T03:12:19.2923237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2923322Z fn() 2022-11-23T03:12:19.2923709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2923828Z test(self, **param_kwargs) 2022-11-23T03:12:19.2924158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2924272Z return func(*args, **kwargs) 2022-11-23T03:12:19.2924494Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2924594Z self.run_subtests( 2022-11-23T03:12:19.2924925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2925073Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2925414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2925553Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2925967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2926077Z output = model(*input) 2022-11-23T03:12:19.2926384Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2926511Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2927049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2927217Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2927571Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2927683Z _lazy_init(state, module) 2022-11-23T03:12:19.2928017Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2928155Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2928485Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2928602Z return func(*args, **kwargs) 2022-11-23T03:12:19.2928967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2929061Z p_assert( 2022-11-23T03:12:19.2929387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2929665Z traceback.print_stack() 2022-11-23T03:12:19.2929776Z File "", line 1, in 2022-11-23T03:12:19.2929968Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2930096Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2930280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2930423Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2930622Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2930715Z self.run() 2022-11-23T03:12:19.2931078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2931217Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2931546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2931669Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2932016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2932131Z getattr(self, test_name)() 2022-11-23T03:12:19.2932478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2932566Z fn() 2022-11-23T03:12:19.2932964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2933083Z test(self, **param_kwargs) 2022-11-23T03:12:19.2933489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2933604Z return func(*args, **kwargs) 2022-11-23T03:12:19.2933987Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2934088Z self.run_subtests( 2022-11-23T03:12:19.2934416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2934563Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2934893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2935033Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2935447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2935552Z output = model(*input) 2022-11-23T03:12:19.2935854Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2935981Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2936522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2936690Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2937036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2937149Z _lazy_init(state, module) 2022-11-23T03:12:19.2937489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2937627Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2937958Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2938074Z return func(*args, **kwargs) 2022-11-23T03:12:19.2938445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2938538Z p_assert( 2022-11-23T03:12:19.2938858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2938974Z traceback.print_stack() 2022-11-23T03:12:19.2939095Z File "", line 1, in 2022-11-23T03:12:19.2939458Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2939588Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2939946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2940093Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2940296Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2940394Z self.run() 2022-11-23T03:12:19.2940588Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2940724Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2941056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2941179Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2941528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2941643Z getattr(self, test_name)() 2022-11-23T03:12:19.2941981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2942070Z fn() 2022-11-23T03:12:19.2942475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2942595Z test(self, **param_kwargs) 2022-11-23T03:12:19.2942943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2943056Z return func(*args, **kwargs) 2022-11-23T03:12:19.2943284Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2943419Z self.run_subtests( 2022-11-23T03:12:19.2943802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2944171Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2944536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2944680Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2945128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2945238Z output = model(*input) 2022-11-23T03:12:19.2945550Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2945681Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2946036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2946204Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2946563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2946675Z _lazy_init(state, module) 2022-11-23T03:12:19.2947014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2947153Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2947483Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2947598Z return func(*args, **kwargs) 2022-11-23T03:12:19.2948118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2948211Z p_assert( 2022-11-23T03:12:19.2948525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2948637Z traceback.print_stack() 2022-11-23T03:12:19.2948753Z File "", line 1, in 2022-11-23T03:12:19.2948947Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2949077Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2949442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2949582Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2949788Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2949883Z self.run() 2022-11-23T03:12:19.2950077Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2950212Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2950542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2950667Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2951011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2951127Z getattr(self, test_name)() 2022-11-23T03:12:19.2951472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2951559Z fn() 2022-11-23T03:12:19.2951974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.2952097Z test(self, **param_kwargs) 2022-11-23T03:12:19.2952444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2952559Z return func(*args, **kwargs) 2022-11-23T03:12:19.2952781Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2952887Z self.run_subtests( 2022-11-23T03:12:19.2953229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2953380Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2953736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2953877Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2954294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2954404Z output = model(*input) 2022-11-23T03:12:19.2954711Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2954846Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2955212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2955378Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2955731Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2955842Z _lazy_init(state, module) 2022-11-23T03:12:19.2956180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2956317Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2956641Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2956761Z return func(*args, **kwargs) 2022-11-23T03:12:19.2957129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2957221Z p_assert( 2022-11-23T03:12:19.2957709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.2957822Z traceback.print_stack() 2022-11-23T03:12:19.2958042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2958262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2958469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2958691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.2958809Z File "", line 1, in 2022-11-23T03:12:19.2959004Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2959131Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2959318Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2959454Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2959651Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2959735Z self.run() 2022-11-23T03:12:19.2959922Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2960050Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2960371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2960495Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2960879Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2960995Z getattr(self, test_name)() 2022-11-23T03:12:19.2961334Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2961414Z fn() 2022-11-23T03:12:19.2961754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2961862Z test(self, **param_kwargs) 2022-11-23T03:12:19.2962193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2962303Z return func(*args, **kwargs) 2022-11-23T03:12:19.2962522Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2962668Z self.run_subtests( 2022-11-23T03:12:19.2963181Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2963338Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2963692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2963834Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2964198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2964310Z output = model(*input) 2022-11-23T03:12:19.2964623Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2964754Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2965118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2965284Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2965641Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2965753Z _lazy_init(state, module) 2022-11-23T03:12:19.2966253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2966384Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2966698Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2966808Z return func(*args, **kwargs) 2022-11-23T03:12:19.2967162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2967246Z p_assert( 2022-11-23T03:12:19.2967559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2967678Z traceback.print_stack() 2022-11-23T03:12:19.2967796Z File "", line 1, in 2022-11-23T03:12:19.2967990Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2968116Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2968300Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2968430Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2968626Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2968718Z self.run() 2022-11-23T03:12:19.2968904Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2969035Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2969357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2969478Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2969861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2969972Z getattr(self, test_name)() 2022-11-23T03:12:19.2970312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2970397Z fn() 2022-11-23T03:12:19.2970739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2970850Z test(self, **param_kwargs) 2022-11-23T03:12:19.2971181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2971292Z return func(*args, **kwargs) 2022-11-23T03:12:19.2971512Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2971652Z self.run_subtests( 2022-11-23T03:12:19.2971987Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2972137Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2972477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2972615Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2972965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2973070Z output = model(*input) 2022-11-23T03:12:19.2973376Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2973500Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2973851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2974018Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2974554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2974666Z _lazy_init(state, module) 2022-11-23T03:12:19.2975006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2975138Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2975462Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2975571Z return func(*args, **kwargs) 2022-11-23T03:12:19.2975937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2976031Z p_assert( 2022-11-23T03:12:19.2976355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2976480Z traceback.print_stack() 2022-11-23T03:12:19.2976602Z File "", line 1, in 2022-11-23T03:12:19.2976801Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2976929Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2977123Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2977420Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2977615Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2977706Z self.run() 2022-11-23T03:12:19.2977892Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2978022Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2978337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2978453Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2978871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2978988Z getattr(self, test_name)() 2022-11-23T03:12:19.2979513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2979601Z fn() 2022-11-23T03:12:19.2979953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2980066Z test(self, **param_kwargs) 2022-11-23T03:12:19.2980409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2980519Z return func(*args, **kwargs) 2022-11-23T03:12:19.2980747Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2980900Z self.run_subtests( 2022-11-23T03:12:19.2981245Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2981397Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2981750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2981895Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2982261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2982365Z output = model(*input) 2022-11-23T03:12:19.2982680Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2982811Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2983174Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2983349Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2983706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2983819Z _lazy_init(state, module) 2022-11-23T03:12:19.2984717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2984850Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2985179Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2985295Z return func(*args, **kwargs) 2022-11-23T03:12:19.2985661Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2985754Z p_assert( 2022-11-23T03:12:19.2986080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2986207Z traceback.print_stack() 2022-11-23T03:12:19.2986326Z File "", line 1, in 2022-11-23T03:12:19.2986520Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2986653Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2986844Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2986984Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2987186Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2987280Z self.run() 2022-11-23T03:12:19.2987473Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2987602Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2987931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2988057Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2988628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2988746Z getattr(self, test_name)() 2022-11-23T03:12:19.2989083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2989167Z fn() 2022-11-23T03:12:19.2989556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.2989662Z test(self, **param_kwargs) 2022-11-23T03:12:19.2989997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.2990108Z return func(*args, **kwargs) 2022-11-23T03:12:19.2990328Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.2990502Z self.run_subtests( 2022-11-23T03:12:19.2990837Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.2990985Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.2991326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.2991640Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.2992003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.2992115Z output = model(*input) 2022-11-23T03:12:19.2992429Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.2992561Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.2992927Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.2993101Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.2993460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.2993566Z _lazy_init(state, module) 2022-11-23T03:12:19.2993907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.2994042Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.2994528Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.2994641Z return func(*args, **kwargs) 2022-11-23T03:12:19.2994997Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.2995088Z p_assert( 2022-11-23T03:12:19.2995407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.2995516Z traceback.print_stack() 2022-11-23T03:12:19.2995734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2995951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2996348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2996569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.2996691Z File "", line 1, in 2022-11-23T03:12:19.2996892Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.2997028Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.2997213Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.2997354Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.2997627Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.2997730Z self.run() 2022-11-23T03:12:19.2997922Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.2998057Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.2998387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.2998511Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.2998854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.2998972Z getattr(self, test_name)() 2022-11-23T03:12:19.2999467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.2999553Z fn() 2022-11-23T03:12:19.2999892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3000231Z test(self, **param_kwargs) 2022-11-23T03:12:19.3000579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3000687Z return func(*args, **kwargs) 2022-11-23T03:12:19.3000914Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3001017Z self.run_subtests( 2022-11-23T03:12:19.3001359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3001512Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3001864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3002009Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3002383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3002497Z output = model(*input) 2022-11-23T03:12:19.3002807Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3002937Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3003457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3003619Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3003962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3004071Z _lazy_init(state, module) 2022-11-23T03:12:19.3004400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3004532Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3005032Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3005150Z return func(*args, **kwargs) 2022-11-23T03:12:19.3005518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3005610Z p_assert( 2022-11-23T03:12:19.3005938Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3006054Z traceback.print_stack() 2022-11-23T03:12:19.3006175Z File "", line 1, in 2022-11-23T03:12:19.3006368Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3006503Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3006695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3006840Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3007089Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3007192Z self.run() 2022-11-23T03:12:19.3007384Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3007519Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3007844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3007969Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3008320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3008434Z getattr(self, test_name)() 2022-11-23T03:12:19.3008783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3008871Z fn() 2022-11-23T03:12:19.3009377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3009538Z test(self, **param_kwargs) 2022-11-23T03:12:19.3009869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3009982Z return func(*args, **kwargs) 2022-11-23T03:12:19.3010206Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3010308Z self.run_subtests( 2022-11-23T03:12:19.3010636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3010783Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3011121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3011259Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3011798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3011909Z output = model(*input) 2022-11-23T03:12:19.3012221Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3012352Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3012717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3012884Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3013235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3013347Z _lazy_init(state, module) 2022-11-23T03:12:19.3013681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3013820Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3014151Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3014267Z return func(*args, **kwargs) 2022-11-23T03:12:19.3014632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3014726Z p_assert( 2022-11-23T03:12:19.3015049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3015165Z traceback.print_stack() 2022-11-23T03:12:19.3015279Z File "", line 1, in 2022-11-23T03:12:19.3015478Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3015610Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3015958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3016100Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3016345Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3016442Z self.run() 2022-11-23T03:12:19.3016622Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3016752Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3017069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3017187Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3017523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3017632Z getattr(self, test_name)() 2022-11-23T03:12:19.3018145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3018235Z fn() 2022-11-23T03:12:19.3018639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3018754Z test(self, **param_kwargs) 2022-11-23T03:12:19.3019097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3019212Z return func(*args, **kwargs) 2022-11-23T03:12:19.3019439Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3019541Z self.run_subtests( 2022-11-23T03:12:19.3019881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3020034Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3020380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3020525Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3021054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3021163Z output = model(*input) 2022-11-23T03:12:19.3021466Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3021596Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3021947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3022106Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3022443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3022553Z _lazy_init(state, module) 2022-11-23T03:12:19.3022882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3023017Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3023339Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3023452Z return func(*args, **kwargs) 2022-11-23T03:12:19.3023808Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3024112Z p_assert( 2022-11-23T03:12:19.3024611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3024728Z traceback.print_stack() 2022-11-23T03:12:19.3024847Z File "", line 1, in 2022-11-23T03:12:19.3025045Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3025177Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3025370Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3025512Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3025783Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3025892Z self.run() 2022-11-23T03:12:19.3026087Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3026223Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3026553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3026676Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3027024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3027137Z getattr(self, test_name)() 2022-11-23T03:12:19.3027479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3027569Z fn() 2022-11-23T03:12:19.3027993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3028106Z test(self, **param_kwargs) 2022-11-23T03:12:19.3028602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3028713Z return func(*args, **kwargs) 2022-11-23T03:12:19.3028933Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3029035Z self.run_subtests( 2022-11-23T03:12:19.3029359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3029506Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3029848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3029986Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3030347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3030452Z output = model(*input) 2022-11-23T03:12:19.3030756Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3030884Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3031414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3031583Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3031940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3032051Z _lazy_init(state, module) 2022-11-23T03:12:19.3032394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3032531Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3032860Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3032975Z return func(*args, **kwargs) 2022-11-23T03:12:19.3033387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3033486Z p_assert( 2022-11-23T03:12:19.3033815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3033931Z traceback.print_stack() 2022-11-23T03:12:19.3034315Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3034527Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3034741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3034959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3035216Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3035432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3035646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3035857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3036065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3036273Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3036481Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3036870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3037674Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3038410Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3039140Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3040034Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3040928Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3041651Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3042371Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3043088Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3043806Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3044575Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3045304Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3046025Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3047117Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3047838Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3048553Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3049275Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3049506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3049893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3050111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3050324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3050534Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3050744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3050963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3051351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3051562Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3051779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3051997Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3052213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3052430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3052645Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3052860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3053123Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3053339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3053553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3053767Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3053979Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3054192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3054406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3054620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3054832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3055633Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3056353Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3057077Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3057969Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3058827Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3059546Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3060267Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3060985Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3061213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3061435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3061655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3061925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3062151Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3062364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3062582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3062797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3063162Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3063556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3063772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3064259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3064488Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3064709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3064921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3065140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3065348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3065567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3065781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3065996Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3066215Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3066428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3066641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3066854Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3067588Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3068473Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3069174Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3069872Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3070832Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3071570Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3072285Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3073006Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3073284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3073505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3073725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3073944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3074164Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3074379Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3074595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3074813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3075029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3075250Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3075467Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3075685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3075900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3076114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3076329Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3076545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3076757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3076970Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3077188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3077405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3077618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3077983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3078191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3078397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3079154Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3080065Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3080791Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3081517Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3082289Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3083009Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.3083729Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3084456Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3084685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3084901Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3085121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3085343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3085564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3085787Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3086004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3086220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3086433Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3086642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3087090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3087404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3087507Z dist init r=1, world=4 2022-11-23T03:12:19.3087872Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3088191Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3088488Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3088781Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3089073Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3089364Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3089748Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3090043Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3090494Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:19.3090773Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3091052Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3091337Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3091438Z dist init r=3, world=4 2022-11-23T03:12:19.3091740Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3092037Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3092325Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3092612Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3092896Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3093179Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3093462Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:19.3093743Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3094024Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3094344Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3094637Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3095095Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3095197Z dist init r=0, world=4 2022-11-23T03:12:19.3095509Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3095817Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3096119Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3096455Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3096750Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.3097041Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3097335Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3097624Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3097919Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3098208Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3098499Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3098789Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3098891Z dist init r=2, world=4 2022-11-23T03:12:19.3099204Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3099668Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3099961Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:19.3100248Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3100713Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3101008Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3101351Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3101648Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3101940Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3102232Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3102524Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3102815Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3102962Z ok (10.832s) 2022-11-23T03:12:19.3103294Z test_transformer_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29483 2022-11-23T03:12:19.3103662Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29484 2022-11-23T03:12:19.3104074Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29485 2022-11-23T03:12:19.3104287Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29486 2022-11-23T03:12:19.3104832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.3105003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.3105376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.3105570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.3105921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.3106091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.3106457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.3106638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.3106994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:12:19.3107160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.3107526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.3107715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.3115734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:12:19.3115955Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:12:19.3116524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:12:19.3116696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:12:19.3116926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:12:19.3117156Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:12:19.3117379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:12:19.3117603Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:12:19.3118100Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.3118681Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.3119061Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:12:19.3119435Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:12:19.3119649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:12:19.3119869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:12:19.3120085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:12:19.3120422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:12:19.3120649Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3120875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3121099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3121480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3122461Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.3122570Z warnings.warn( 2022-11-23T03:12:19.3123543Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.3123635Z warnings.warn( 2022-11-23T03:12:19.3124779Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:12:19.3124884Z warnings.warn( 2022-11-23T03:12:19.3125887Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:12:19.3125990Z warnings.warn( 2022-11-23T03:12:19.3126110Z File "", line 1, in 2022-11-23T03:12:19.3126315Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3126448Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3126643Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3126835Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3126957Z File "", line 1, in 2022-11-23T03:12:19.3127163Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3127257Z self.run() 2022-11-23T03:12:19.3127615Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3127748Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3127939Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3128066Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3128392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3128509Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3128690Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3128875Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3129220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3129333Z getattr(self, test_name)() 2022-11-23T03:12:19.3129528Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3129620Z self.run() 2022-11-23T03:12:19.3129957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3130040Z fn() 2022-11-23T03:12:19.3130229Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3130365Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3130712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3130821Z test(self, **param_kwargs) 2022-11-23T03:12:19.3131144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3131264Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3131775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3131894Z return func(*args, **kwargs) 2022-11-23T03:12:19.3132247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3132361Z getattr(self, test_name)() 2022-11-23T03:12:19.3132594Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3132698Z self.run_subtests( 2022-11-23T03:12:19.3133049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3133137Z fn() 2022-11-23T03:12:19.3133540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3133705Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3134062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3134176Z test(self, **param_kwargs) 2022-11-23T03:12:19.3134679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3134819Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3135150Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3135261Z return func(*args, **kwargs) 2022-11-23T03:12:19.3135605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3135721Z output = model(*input) 2022-11-23T03:12:19.3135991Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3136099Z self.run_subtests( 2022-11-23T03:12:19.3136405Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3136533Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3136860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3137189Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3137549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3137717Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3138070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3138259Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3138620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3138733Z _lazy_init(state, module) 2022-11-23T03:12:19.3139098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3139207Z output = model(*input) 2022-11-23T03:12:19.3139548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:12:19.3139674Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3140150Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3140277Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3140769Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3140888Z return func(*args, **kwargs) 2022-11-23T03:12:19.3141255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3141424Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3141793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3141882Z p_assert( 2022-11-23T03:12:19.3142239Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3142350Z _lazy_init(state, module) 2022-11-23T03:12:19.3142677Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3142794Z traceback.print_stack() 2022-11-23T03:12:19.3143131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3143270Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3143598Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3143706Z return func(*args, **kwargs) 2022-11-23T03:12:19.3144357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3144458Z p_assert( 2022-11-23T03:12:19.3144793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in 
p_assert 2022-11-23T03:12:19.3144909Z traceback.print_stack() 2022-11-23T03:12:19.3145031Z File "", line 1, in 2022-11-23T03:12:19.3145233Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3145361Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3145554Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3145783Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3145997Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3146091Z self.run() 2022-11-23T03:12:19.3146282Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3146419Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3146752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3146871Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3147222Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3147336Z getattr(self, test_name)() 2022-11-23T03:12:19.3147684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3147835Z fn() 2022-11-23T03:12:19.3148350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3148460Z test(self, **param_kwargs) 2022-11-23T03:12:19.3148795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3148900Z return func(*args, **kwargs) 2022-11-23T03:12:19.3149121Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3149221Z self.run_subtests( 2022-11-23T03:12:19.3149550Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3149697Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3150038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3150181Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3150536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3150638Z output = model(*input) 2022-11-23T03:12:19.3150943Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3151069Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3151592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3151760Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3152112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3152225Z _lazy_init(state, module) 2022-11-23T03:12:19.3152567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3152701Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3153030Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3153147Z return func(*args, **kwargs) 2022-11-23T03:12:19.3153510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3153603Z p_assert( 2022-11-23T03:12:19.3153927Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3154044Z traceback.print_stack() 2022-11-23T03:12:19.3154157Z File "", line 1, in 2022-11-23T03:12:19.3154359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3154490Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3154684Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3154872Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3155083Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3155179Z self.run() 2022-11-23T03:12:19.3155376Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3155506Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3155836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3155961Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3156309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3156422Z getattr(self, test_name)() 2022-11-23T03:12:19.3156770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3156903Z fn() 2022-11-23T03:12:19.3157264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3157370Z test(self, **param_kwargs) 2022-11-23T03:12:19.3157713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3157828Z return func(*args, **kwargs) 2022-11-23T03:12:19.3158058Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3158327Z self.run_subtests( 2022-11-23T03:12:19.3158653Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3158799Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3159139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3159277Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3159628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3159736Z output = model(*input) 2022-11-23T03:12:19.3160040Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3160169Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3160519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3160680Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3161023Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3161126Z _lazy_init(state, module) 2022-11-23T03:12:19.3161451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3161586Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3162084Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3162199Z return func(*args, **kwargs) 2022-11-23T03:12:19.3162567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3162660Z p_assert( 2022-11-23T03:12:19.3162982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3163094Z traceback.print_stack() 2022-11-23T03:12:19.3163323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3163549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3163771Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3164042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3164171Z File "", line 1, in 2022-11-23T03:12:19.3164375Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3164509Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3164695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3164833Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3165035Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3165129Z self.run() 2022-11-23T03:12:19.3165324Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3165461Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3165797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3165962Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3166441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3166556Z getattr(self, test_name)() 2022-11-23T03:12:19.3167065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3167153Z fn() 2022-11-23T03:12:19.3167496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3167606Z test(self, **param_kwargs) 2022-11-23T03:12:19.3167937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3168043Z return func(*args, **kwargs) 2022-11-23T03:12:19.3168263Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3168372Z self.run_subtests( 2022-11-23T03:12:19.3168705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3168853Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3169190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3169329Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3169682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3169781Z output = model(*input) 2022-11-23T03:12:19.3170083Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3170208Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3170565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3170732Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3171074Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3171185Z _lazy_init(state, module) 2022-11-23T03:12:19.3171513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3171637Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3171950Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3172060Z return func(*args, **kwargs) 2022-11-23T03:12:19.3172411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3172500Z p_assert( 2022-11-23T03:12:19.3172876Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3172996Z traceback.print_stack() 2022-11-23T03:12:19.3173111Z File "", line 1, in 2022-11-23T03:12:19.3173297Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3173426Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3173611Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3173750Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3173944Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3174034Z self.run() 2022-11-23T03:12:19.3174221Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3174346Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3174847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3175040Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3175393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3175506Z getattr(self, test_name)() 2022-11-23T03:12:19.3175850Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3175938Z fn() 2022-11-23T03:12:19.3176287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3176394Z test(self, **param_kwargs) 2022-11-23T03:12:19.3176741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3176858Z return func(*args, **kwargs) 2022-11-23T03:12:19.3177087Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3177198Z self.run_subtests( 2022-11-23T03:12:19.3177540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3177692Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3178041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3178177Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3178539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3178649Z output = model(*input) 2022-11-23T03:12:19.3178961Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3179093Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3179463Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3179630Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3179987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3180091Z _lazy_init(state, module) 2022-11-23T03:12:19.3180430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3180565Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3180891Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3181006Z return func(*args, **kwargs) 2022-11-23T03:12:19.3181372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3181468Z p_assert( 2022-11-23T03:12:19.3181840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3181956Z traceback.print_stack() 2022-11-23T03:12:19.3182075Z File "", line 1, in 2022-11-23T03:12:19.3182271Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3182403Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3182594Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3182736Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3182936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3183029Z self.run() 2022-11-23T03:12:19.3183214Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3183349Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3183679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3184120Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3184502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3184616Z getattr(self, test_name)() 2022-11-23T03:12:19.3184961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3185049Z fn() 2022-11-23T03:12:19.3185395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3185511Z test(self, **param_kwargs) 2022-11-23T03:12:19.3185856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3185970Z return func(*args, **kwargs) 2022-11-23T03:12:19.3186199Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3186311Z self.run_subtests( 2022-11-23T03:12:19.3186653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3186798Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3187149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3187294Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3187656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3187765Z output = model(*input) 2022-11-23T03:12:19.3188080Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3188374Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3188737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3188900Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3189238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3189345Z _lazy_init(state, module) 2022-11-23T03:12:19.3189719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3189851Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3190165Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3190278Z return func(*args, **kwargs) 2022-11-23T03:12:19.3190638Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3190731Z p_assert( 2022-11-23T03:12:19.3191114Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3191237Z traceback.print_stack() 2022-11-23T03:12:19.3191353Z File "", line 1, in 2022-11-23T03:12:19.3191549Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3191679Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3191865Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3192001Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3192192Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3192283Z self.run() 2022-11-23T03:12:19.3192470Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3192601Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3192984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3193107Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3193444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3193554Z getattr(self, test_name)() 2022-11-23T03:12:19.3193883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3193968Z fn() 2022-11-23T03:12:19.3194308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3194419Z test(self, **param_kwargs) 2022-11-23T03:12:19.3194749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3194860Z return func(*args, **kwargs) 2022-11-23T03:12:19.3195080Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3195188Z self.run_subtests( 2022-11-23T03:12:19.3195514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3195660Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3195998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3196135Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3196662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3196773Z output = model(*input) 2022-11-23T03:12:19.3197086Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3197217Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3197584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3197751Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3198105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3198217Z _lazy_init(state, module) 2022-11-23T03:12:19.3198556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3198691Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3199016Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3199131Z return func(*args, **kwargs) 2022-11-23T03:12:19.3199491Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3199589Z p_assert( 2022-11-23T03:12:19.3200119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3200238Z traceback.print_stack() 2022-11-23T03:12:19.3200458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3200676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3201072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3201296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3201410Z File "", line 1, in 2022-11-23T03:12:19.3201609Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3201742Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3201932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3202125Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3202333Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3202428Z self.run() 2022-11-23T03:12:19.3202615Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3202753Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3203086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3203213Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3203562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3203677Z getattr(self, test_name)() 2022-11-23T03:12:19.3204352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3204447Z fn() 2022-11-23T03:12:19.3204803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3204916Z test(self, **param_kwargs) 2022-11-23T03:12:19.3205263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3205378Z return func(*args, **kwargs) 2022-11-23T03:12:19.3205605Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3205708Z self.run_subtests( 2022-11-23T03:12:19.3206051Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3206204Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3206547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3206694Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3207059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3207171Z output = model(*input) 2022-11-23T03:12:19.3207485Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3207617Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3207983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3208148Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3208498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3208609Z _lazy_init(state, module) 2022-11-23T03:12:19.3208950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3209134Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3209473Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3209585Z return func(*args, **kwargs) 2022-11-23T03:12:19.3209953Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3210046Z p_assert( 2022-11-23T03:12:19.3210363Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3210487Z traceback.print_stack() 2022-11-23T03:12:19.3210607Z File "", line 1, in 2022-11-23T03:12:19.3210806Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3210938Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3211129Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3211324Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3211529Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3211617Z self.run() 2022-11-23T03:12:19.3211811Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3211945Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3212274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3212399Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3212748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3212865Z getattr(self, test_name)() 2022-11-23T03:12:19.3213204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3213295Z fn() 2022-11-23T03:12:19.3213652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3213764Z test(self, **param_kwargs) 2022-11-23T03:12:19.3214108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3214223Z return func(*args, **kwargs) 2022-11-23T03:12:19.3214453Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3214557Z self.run_subtests( 2022-11-23T03:12:19.3214892Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3215045Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3215393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3215540Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3215906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3216016Z output = model(*input) 2022-11-23T03:12:19.3216328Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3216460Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3216818Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3216986Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3217493Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3217601Z _lazy_init(state, module) 2022-11-23T03:12:19.3218115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3218298Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3218636Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3218751Z return func(*args, **kwargs) 2022-11-23T03:12:19.3219112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3219206Z p_assert( 2022-11-23T03:12:19.3219531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3219648Z traceback.print_stack() 2022-11-23T03:12:19.3219768Z File "", line 1, in 2022-11-23T03:12:19.3219967Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3220099Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3220291Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3220478Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3220681Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3220780Z self.run() 2022-11-23T03:12:19.3220973Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3221107Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3221437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3221723Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3222064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3222168Z getattr(self, test_name)() 2022-11-23T03:12:19.3222504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3222592Z fn() 2022-11-23T03:12:19.3222936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3223046Z test(self, **param_kwargs) 2022-11-23T03:12:19.3223379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3223489Z return func(*args, **kwargs) 2022-11-23T03:12:19.3223704Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3223804Z self.run_subtests( 2022-11-23T03:12:19.3224558Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3224715Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3225062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3225211Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3225578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3225689Z output = model(*input) 2022-11-23T03:12:19.3225999Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3226131Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3226498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3226663Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3227017Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3227126Z _lazy_init(state, module) 2022-11-23T03:12:19.3227466Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3227673Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3228016Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3228125Z return func(*args, **kwargs) 2022-11-23T03:12:19.3228490Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3228584Z p_assert( 2022-11-23T03:12:19.3228908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3229024Z traceback.print_stack() 2022-11-23T03:12:19.3229144Z File "", line 1, in 2022-11-23T03:12:19.3229345Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3229473Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3229666Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3230039Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3230235Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3230325Z self.run() 2022-11-23T03:12:19.3230512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3230643Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3230961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3231076Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3231412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3231522Z getattr(self, test_name)() 2022-11-23T03:12:19.3231858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3232129Z fn() 2022-11-23T03:12:19.3232487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3232601Z test(self, **param_kwargs) 2022-11-23T03:12:19.3232946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3233056Z return func(*args, **kwargs) 2022-11-23T03:12:19.3233306Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3233434Z self.run_subtests( 2022-11-23T03:12:19.3233778Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3233931Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3234281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3234428Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3234957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3235061Z output = model(*input) 2022-11-23T03:12:19.3235367Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3235495Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3235848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3236008Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3236352Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3236460Z _lazy_init(state, module) 2022-11-23T03:12:19.3236786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3236958Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3237460Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3237576Z return func(*args, **kwargs) 2022-11-23T03:12:19.3237948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3238042Z p_assert( 2022-11-23T03:12:19.3238367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3238483Z traceback.print_stack() 2022-11-23T03:12:19.3238711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3238932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3239159Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3239435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3239557Z File "", line 1, in 2022-11-23T03:12:19.3239755Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3239888Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3240085Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3240220Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3240586Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3240680Z self.run() 2022-11-23T03:12:19.3241035Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3241170Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3241500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3241633Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3241985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3242092Z getattr(self, test_name)() 2022-11-23T03:12:19.3242442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3242533Z fn() 2022-11-23T03:12:19.3242890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3243003Z test(self, **param_kwargs) 2022-11-23T03:12:19.3243347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3243519Z return func(*args, **kwargs) 2022-11-23T03:12:19.3243766Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3243869Z self.run_subtests( 2022-11-23T03:12:19.3244214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3244367Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3244719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3244863Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3245227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3245337Z output = model(*input) 2022-11-23T03:12:19.3245650Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3245777Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3246141Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3246362Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3246730Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3246842Z _lazy_init(state, module) 2022-11-23T03:12:19.3247341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3247471Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3247786Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3247891Z return func(*args, **kwargs) 2022-11-23T03:12:19.3248243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3248333Z p_assert( 2022-11-23T03:12:19.3248717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3248832Z traceback.print_stack() 2022-11-23T03:12:19.3248952Z File "", line 1, in 2022-11-23T03:12:19.3249146Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3249273Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3249453Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3249767Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3249971Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3250064Z self.run() 2022-11-23T03:12:19.3250258Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3250395Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3250726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3250849Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3251202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3251318Z getattr(self, test_name)() 2022-11-23T03:12:19.3251666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3251754Z fn() 2022-11-23T03:12:19.3252106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3252221Z test(self, **param_kwargs) 2022-11-23T03:12:19.3252564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3252672Z return func(*args, **kwargs) 2022-11-23T03:12:19.3252900Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3253011Z self.run_subtests( 2022-11-23T03:12:19.3253354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3253507Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3253861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3254007Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3254370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3254475Z output = model(*input) 2022-11-23T03:12:19.3254792Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3254924Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3255288Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3255504Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3255873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3255986Z _lazy_init(state, module) 2022-11-23T03:12:19.3256324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3256452Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3256779Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3256895Z return func(*args, **kwargs) 2022-11-23T03:12:19.3257265Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3257359Z p_assert( 2022-11-23T03:12:19.3257739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3257859Z traceback.print_stack() 2022-11-23T03:12:19.3257979Z File "", line 1, in 2022-11-23T03:12:19.3258171Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3258303Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3258656Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3258793Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3258989Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3259080Z self.run() 2022-11-23T03:12:19.3259265Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3259390Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3259707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3259831Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3260169Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3260279Z getattr(self, test_name)() 2022-11-23T03:12:19.3260613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3260699Z fn() 2022-11-23T03:12:19.3261040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3261144Z test(self, **param_kwargs) 2022-11-23T03:12:19.3261475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3261586Z return func(*args, **kwargs) 2022-11-23T03:12:19.3261701Z File "", line 1, in 2022-11-23T03:12:19.3261929Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3262030Z self.run_subtests( 2022-11-23T03:12:19.3262358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3262507Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3262694Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3262820Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3263161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3263300Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3263486Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3263624Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3264485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3264612Z output = model(*input) 2022-11-23T03:12:19.3264814Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3264913Z self.run() 2022-11-23T03:12:19.3265238Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3265370Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3265565Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3265700Z 
self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3266069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3266236Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3266626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3266755Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3267110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3267224Z _lazy_init(state, module) 2022-11-23T03:12:19.3267574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3267687Z getattr(self, test_name)() 2022-11-23T03:12:19.3268023Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3268155Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3268654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3268741Z fn() 2022-11-23T03:12:19.3269057Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3269177Z return func(*args, **kwargs) 2022-11-23T03:12:19.3269516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3269625Z test(self, **param_kwargs) 2022-11-23T03:12:19.3269977Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3270066Z p_assert( 2022-11-23T03:12:19.3270394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T03:12:19.3270505Z return func(*args, **kwargs) 2022-11-23T03:12:19.3270816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3270931Z traceback.print_stack() 2022-11-23T03:12:19.3271150Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3271257Z self.run_subtests( 2022-11-23T03:12:19.3271587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3271729Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3272070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3272211Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3272564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3272669Z output = model(*input) 2022-11-23T03:12:19.3272971Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3273097Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3273497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3273667Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3274006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3274115Z _lazy_init(state, module) 2022-11-23T03:12:19.3274444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3274573Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3275064Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3275179Z return func(*args, **kwargs) 2022-11-23T03:12:19.3275544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3275684Z p_assert( 2022-11-23T03:12:19.3276010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3276128Z traceback.print_stack() 2022-11-23T03:12:19.3276356Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3276583Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3276804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3277025Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3277145Z File "", line 1, in 2022-11-23T03:12:19.3277346Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3277474Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3277665Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3277809Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3278016Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3278110Z self.run() 2022-11-23T03:12:19.3278459Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3278592Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3278902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3279022Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3279363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3279474Z getattr(self, test_name)() 2022-11-23T03:12:19.3279807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3279892Z fn() 2022-11-23T03:12:19.3280421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3280535Z test(self, **param_kwargs) 2022-11-23T03:12:19.3280873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3280990Z return func(*args, **kwargs) 2022-11-23T03:12:19.3281218Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3281323Z self.run_subtests( 2022-11-23T03:12:19.3281664Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3281818Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3282168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3282312Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3282717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3282839Z output = model(*input) 2022-11-23T03:12:19.3283159Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3283290Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3283652Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3283817Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3284171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3284284Z _lazy_init(state, module) 2022-11-23T03:12:19.3284615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3284798Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3285129Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3285244Z return func(*args, **kwargs) 2022-11-23T03:12:19.3285613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3285705Z p_assert( 2022-11-23T03:12:19.3286027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3286142Z traceback.print_stack() 2022-11-23T03:12:19.3286254Z File "", line 1, in 2022-11-23T03:12:19.3286452Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3286583Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3286778Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3286926Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3287130Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3287225Z self.run() 2022-11-23T03:12:19.3287410Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3287547Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3288031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3288154Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3288270Z File "", line 1, in 2022-11-23T03:12:19.3288607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3288717Z getattr(self, test_name)() 2022-11-23T03:12:19.3289053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3289137Z fn() 2022-11-23T03:12:19.3289334Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3289463Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3289865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3289977Z test(self, **param_kwargs) 2022-11-23T03:12:19.3290161Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3290299Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3290634Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3290738Z return func(*args, **kwargs) 2022-11-23T03:12:19.3290934Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3291024Z self.run() 2022-11-23T03:12:19.3291298Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3291404Z self.run_subtests( 2022-11-23T03:12:19.3291593Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3291724Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3292244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3292396Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3292719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3292843Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3293192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3293334Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3293737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3293851Z getattr(self, test_name)() 2022-11-23T03:12:19.3294209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3294319Z output = model(*input) 2022-11-23T03:12:19.3294825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3294910Z fn() 2022-11-23T03:12:19.3295212Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3295339Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3295680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3295789Z test(self, **param_kwargs) 2022-11-23T03:12:19.3296139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3296302Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3296636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3296748Z return func(*args, **kwargs) 2022-11-23T03:12:19.3297276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3297388Z _lazy_init(state, module) 2022-11-23T03:12:19.3297617Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3297720Z self.run_subtests( 2022-11-23T03:12:19.3298052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3298190Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3298534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3298687Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3299015Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3299130Z return func(*args, **kwargs) 2022-11-23T03:12:19.3299482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3299625Z 
fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3299982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3300075Z p_assert( 2022-11-23T03:12:19.3300586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3300700Z output = model(*input) 2022-11-23T03:12:19.3301059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3301357Z traceback.print_stack() 2022-11-23T03:12:19.3301676Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3301808Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3302162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3302328Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3302686Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3302799Z _lazy_init(state, module) 2022-11-23T03:12:19.3303136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3303322Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3303651Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3303766Z return func(*args, **kwargs) 2022-11-23T03:12:19.3304688Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3304786Z p_assert( 2022-11-23T03:12:19.3305112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3305227Z traceback.print_stack() 2022-11-23T03:12:19.3305348Z File "", line 1, in 2022-11-23T03:12:19.3305545Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3305682Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3305873Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3306018Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3306222Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3306318Z self.run() 2022-11-23T03:12:19.3306510Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3306645Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3306974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3307098Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3307449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3307556Z getattr(self, test_name)() 2022-11-23T03:12:19.3307901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3307993Z fn() 2022-11-23T03:12:19.3308351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3308465Z test(self, **param_kwargs) 2022-11-23T03:12:19.3308810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3308927Z return func(*args, **kwargs) 2022-11-23T03:12:19.3309150Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3309253Z self.run_subtests( 2022-11-23T03:12:19.3309753Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3310082Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3310435Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3310582Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3311010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3311131Z output = model(*input) 2022-11-23T03:12:19.3311444Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3311578Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3311942Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3312107Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3312462Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3312573Z _lazy_init(state, module) 2022-11-23T03:12:19.3312909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3313127Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3313457Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3313567Z return func(*args, **kwargs) 2022-11-23T03:12:19.3313936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3314031Z p_assert( 2022-11-23T03:12:19.3314354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3314469Z traceback.print_stack() 2022-11-23T03:12:19.3314697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3314923Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3315148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3315371Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3315493Z File "", line 1, in 2022-11-23T03:12:19.3315691Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3315822Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3316014Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3316154Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3316354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3316442Z self.run() 2022-11-23T03:12:19.3316634Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3316929Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3317249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3317375Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3317716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3317825Z getattr(self, test_name)() 2022-11-23T03:12:19.3318161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3318240Z fn() 2022-11-23T03:12:19.3318582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3318696Z test(self, **param_kwargs) 2022-11-23T03:12:19.3319205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3319323Z return func(*args, **kwargs) 2022-11-23T03:12:19.3319549Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3319656Z self.run_subtests( 2022-11-23T03:12:19.3320044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3320198Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3320553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3320697Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3321060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3321169Z output = model(*input) 2022-11-23T03:12:19.3321484Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3321617Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3321982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3322197Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3322556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3322670Z _lazy_init(state, module) 2022-11-23T03:12:19.3323168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3323301Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3323614Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3323727Z return func(*args, **kwargs) 2022-11-23T03:12:19.3324079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3324161Z p_assert( 2022-11-23T03:12:19.3324477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3324597Z traceback.print_stack() 2022-11-23T03:12:19.3324714Z File "", line 1, in 2022-11-23T03:12:19.3324904Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3325031Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3325216Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3325346Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3325540Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3325630Z self.run() 2022-11-23T03:12:19.3325814Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3325945Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3326264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3326390Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3326730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3326834Z getattr(self, test_name)() 2022-11-23T03:12:19.3327171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3327256Z fn() 2022-11-23T03:12:19.3327598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3327707Z test(self, **param_kwargs) 2022-11-23T03:12:19.3328039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3328151Z return func(*args, **kwargs) 2022-11-23T03:12:19.3328371Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3328469Z self.run_subtests( 2022-11-23T03:12:19.3328841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3328995Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3329337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3329475Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3329826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3329933Z output = model(*input) 2022-11-23T03:12:19.3330236Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3330358Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3330710Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3330920Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3331266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3331374Z _lazy_init(state, module) 2022-11-23T03:12:19.3331699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3331827Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3332141Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3332246Z return func(*args, **kwargs) 2022-11-23T03:12:19.3332787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3332880Z p_assert( 2022-11-23T03:12:19.3333211Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3333342Z traceback.print_stack() 2022-11-23T03:12:19.3333493Z File "", line 1, in 2022-11-23T03:12:19.3333693Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3333825Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3334011Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3334151Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3334354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3334447Z self.run() 2022-11-23T03:12:19.3334639Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3334775Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3335103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3335388Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3335730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3335841Z getattr(self, test_name)() 2022-11-23T03:12:19.3336173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3336258Z fn() 2022-11-23T03:12:19.3336600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3336709Z test(self, **param_kwargs) 2022-11-23T03:12:19.3337226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3337336Z return func(*args, **kwargs) 2022-11-23T03:12:19.3337566Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3337675Z self.run_subtests( 2022-11-23T03:12:19.3338060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3338218Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3338572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3338717Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3339079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3339183Z output = model(*input) 2022-11-23T03:12:19.3339502Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3339633Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3339994Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3340210Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3340568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3340681Z _lazy_init(state, module) 2022-11-23T03:12:19.3341174Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3341470Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3341798Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3341913Z return func(*args, **kwargs) 2022-11-23T03:12:19.3342280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3342372Z p_assert( 2022-11-23T03:12:19.3342706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3342825Z traceback.print_stack() 2022-11-23T03:12:19.3342944Z File "", line 1, in 2022-11-23T03:12:19.3343136Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3343270Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3343460Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3343602Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3343802Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3344107Z self.run() 2022-11-23T03:12:19.3344310Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3344447Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3344776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3344912Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3345262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3345375Z getattr(self, test_name)() 2022-11-23T03:12:19.3345720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3345809Z fn() 2022-11-23T03:12:19.3346160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3346267Z test(self, **param_kwargs) 2022-11-23T03:12:19.3346611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3346726Z return func(*args, **kwargs) 2022-11-23T03:12:19.3346954Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3347127Z self.run_subtests( 2022-11-23T03:12:19.3347481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3347634Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3347983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3348120Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3348483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3348596Z output = model(*input) 2022-11-23T03:12:19.3348909Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3349040Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3349404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3349635Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3349993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3350108Z _lazy_init(state, module) 2022-11-23T03:12:19.3350440Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3350575Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3350900Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3351016Z return func(*args, **kwargs) 2022-11-23T03:12:19.3351383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3351476Z p_assert( 2022-11-23T03:12:19.3351808Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3351919Z traceback.print_stack() 2022-11-23T03:12:19.3352146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3352373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3352595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3352816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3352936Z File "", line 1, in 2022-11-23T03:12:19.3353135Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3353266Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3353451Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3353596Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3353801Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3353894Z self.run() 2022-11-23T03:12:19.3354085Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3354221Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3354553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3354675Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3355019Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3355132Z getattr(self, test_name)() 2022-11-23T03:12:19.3355481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3355571Z fn() 2022-11-23T03:12:19.3355988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3356112Z test(self, **param_kwargs) 2022-11-23T03:12:19.3356459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3356575Z return func(*args, **kwargs) 2022-11-23T03:12:19.3356796Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3356898Z self.run_subtests( 2022-11-23T03:12:19.3357238Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3357390Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3357739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3357881Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3358299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3358409Z output = model(*input) 2022-11-23T03:12:19.3358715Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3359010Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3359362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3359523Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3360046Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3360159Z _lazy_init(state, module) 2022-11-23T03:12:19.3360496Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3360636Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3360959Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3361074Z return func(*args, **kwargs) 2022-11-23T03:12:19.3361441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3361533Z p_assert( 2022-11-23T03:12:19.3361859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3361977Z traceback.print_stack() 2022-11-23T03:12:19.3362097Z File "", line 1, in 2022-11-23T03:12:19.3362297Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3362425Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3362618Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3362924Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3363122Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3363213Z self.run() 2022-11-23T03:12:19.3363398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3363528Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3363840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3363961Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3364301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3364595Z getattr(self, test_name)() 2022-11-23T03:12:19.3364946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3365035Z fn() 2022-11-23T03:12:19.3365439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3365556Z test(self, **param_kwargs) 2022-11-23T03:12:19.3365898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3366015Z return func(*args, **kwargs) 2022-11-23T03:12:19.3366244Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3366348Z self.run_subtests( 2022-11-23T03:12:19.3366688Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3366840Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3367192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3367333Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3367898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3368005Z output = model(*input) 2022-11-23T03:12:19.3368306Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3368431Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3368781Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3368939Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3369281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3369390Z _lazy_init(state, module) 2022-11-23T03:12:19.3369709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3369845Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3370163Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3370275Z return func(*args, **kwargs) 2022-11-23T03:12:19.3370800Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3370894Z p_assert( 2022-11-23T03:12:19.3371220Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3371338Z traceback.print_stack() 2022-11-23T03:12:19.3371451Z File "", line 1, in 2022-11-23T03:12:19.3371651Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3371784Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3371975Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3372122Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3372325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3372421Z self.run() 2022-11-23T03:12:19.3372607Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3372744Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3373074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3373197Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3373704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3373987Z getattr(self, test_name)() 2022-11-23T03:12:19.3374333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3374421Z fn() 2022-11-23T03:12:19.3374820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3374941Z test(self, **param_kwargs) 2022-11-23T03:12:19.3375288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3375403Z return func(*args, **kwargs) 2022-11-23T03:12:19.3375631Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3375737Z self.run_subtests( 2022-11-23T03:12:19.3376076Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3376228Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3376572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3376716Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3377146Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3377257Z output = model(*input) 2022-11-23T03:12:19.3377575Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3377706Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3378061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3378226Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3378583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3378850Z _lazy_init(state, module) 2022-11-23T03:12:19.3379178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3379311Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3379630Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3379741Z return func(*args, **kwargs) 2022-11-23T03:12:19.3380089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3380180Z p_assert( 2022-11-23T03:12:19.3380684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3380802Z traceback.print_stack() 2022-11-23T03:12:19.3380923Z File "", line 1, in 2022-11-23T03:12:19.3381121Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3381254Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3381446Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3381585Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3381790Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3381885Z self.run() 2022-11-23T03:12:19.3382077Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3382214Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3382543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3382669Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3383020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3383128Z getattr(self, test_name)() 2022-11-23T03:12:19.3383476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3383568Z fn() 2022-11-23T03:12:19.3384218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3384351Z test(self, **param_kwargs) 2022-11-23T03:12:19.3384707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3384824Z return func(*args, **kwargs) 2022-11-23T03:12:19.3385044Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3385149Z self.run_subtests( 2022-11-23T03:12:19.3385492Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3385647Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3385999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3386207Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3386573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3386684Z output = model(*input) 2022-11-23T03:12:19.3386991Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3387125Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3387489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3387652Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3388008Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3388118Z _lazy_init(state, module) 2022-11-23T03:12:19.3388457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3388593Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3388920Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3389028Z return func(*args, **kwargs) 2022-11-23T03:12:19.3389398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3389491Z p_assert( 2022-11-23T03:12:19.3389864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3389981Z traceback.print_stack() 2022-11-23T03:12:19.3390209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3390434Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3390658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3391030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3391150Z File "", line 1, in 2022-11-23T03:12:19.3391341Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3391470Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3391655Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3391791Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3391986Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3392072Z self.run() 2022-11-23T03:12:19.3392258Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3392391Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3392712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3392835Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3393218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3393336Z getattr(self, test_name)() 2022-11-23T03:12:19.3393677Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3393756Z fn() 2022-11-23T03:12:19.3394095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3394205Z test(self, **param_kwargs) 2022-11-23T03:12:19.3394536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3394647Z return func(*args, **kwargs) 2022-11-23T03:12:19.3394867Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3395015Z self.run_subtests( 2022-11-23T03:12:19.3395350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3395491Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3395832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3395969Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3396318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3396424Z output = model(*input) 2022-11-23T03:12:19.3396727Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3396855Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3397205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3397545Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3397904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3398017Z _lazy_init(state, module) 2022-11-23T03:12:19.3398356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3398490Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3398814Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3398929Z return func(*args, **kwargs) 2022-11-23T03:12:19.3399295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3399382Z p_assert( 2022-11-23T03:12:19.3399706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3399832Z traceback.print_stack() 2022-11-23T03:12:19.3399952Z File "", line 1, in 2022-11-23T03:12:19.3400151Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3400284Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3400635Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3400768Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3400963Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3401052Z self.run() 2022-11-23T03:12:19.3401237Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3401369Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3401867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3401996Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3402392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3402510Z getattr(self, test_name)() 2022-11-23T03:12:19.3402860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3402949Z fn() 2022-11-23T03:12:19.3403303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3403416Z test(self, **param_kwargs) 2022-11-23T03:12:19.3403761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3403876Z return func(*args, **kwargs) 2022-11-23T03:12:19.3404104Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3404291Z self.run_subtests( 2022-11-23T03:12:19.3404640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3404793Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3405143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3405285Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3405648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3405759Z output = model(*input) 2022-11-23T03:12:19.3406074Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3406200Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3406563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3406738Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3407094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3407206Z _lazy_init(state, module) 2022-11-23T03:12:19.3407542Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3407674Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3407999Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3408110Z return func(*args, **kwargs) 2022-11-23T03:12:19.3408482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3408578Z p_assert( 2022-11-23T03:12:19.3408906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3409027Z traceback.print_stack() 2022-11-23T03:12:19.3409149Z File "", line 1, in 2022-11-23T03:12:19.3409349Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3409482Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3409670Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3409813Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3410020Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3410114Z self.run() 2022-11-23T03:12:19.3410306Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3410441Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3410769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3410934Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3411296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3411412Z getattr(self, test_name)() 2022-11-23T03:12:19.3411758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3411846Z fn() 2022-11-23T03:12:19.3412200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3412313Z test(self, **param_kwargs) 2022-11-23T03:12:19.3412656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3412766Z return func(*args, **kwargs) 2022-11-23T03:12:19.3412996Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3413150Z self.run_subtests( 2022-11-23T03:12:19.3413494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3413647Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3413999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3414143Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3414504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3414608Z output = model(*input) 2022-11-23T03:12:19.3414921Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3415051Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3415413Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3415585Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3415939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3416051Z _lazy_init(state, module) 2022-11-23T03:12:19.3416389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3416517Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3416849Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3416967Z return func(*args, **kwargs) 2022-11-23T03:12:19.3417336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3417429Z p_assert( 2022-11-23T03:12:19.3417759Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3417878Z traceback.print_stack() 2022-11-23T03:12:19.3418001Z File "", line 1, in 2022-11-23T03:12:19.3418196Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3418328Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3418519Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3418662Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3418862Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3418956Z self.run() 2022-11-23T03:12:19.3419149Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3419285Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3419610Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3419785Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3420147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3420261Z getattr(self, test_name)() 2022-11-23T03:12:19.3420607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3420696Z fn() 2022-11-23T03:12:19.3421052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3421159Z test(self, **param_kwargs) 2022-11-23T03:12:19.3421504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3421622Z return func(*args, **kwargs) 2022-11-23T03:12:19.3421849Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3421998Z self.run_subtests( 2022-11-23T03:12:19.3422344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3422496Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3422848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3422985Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3423349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3423458Z output = model(*input) 2022-11-23T03:12:19.3423772Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3424132Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3424509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3424683Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3425036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3425154Z _lazy_init(state, module) 2022-11-23T03:12:19.3425489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3425619Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3425947Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3426060Z return func(*args, **kwargs) 2022-11-23T03:12:19.3426427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3426520Z p_assert( 2022-11-23T03:12:19.3426852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3426962Z traceback.print_stack() 2022-11-23T03:12:19.3427188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3427413Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3427636Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3427855Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3427976Z File "", line 1, in 2022-11-23T03:12:19.3428173Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3428304Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3428490Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3428638Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3428905Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3429011Z self.run() 2022-11-23T03:12:19.3429203Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3429339Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3429670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3429796Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3430138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3430255Z getattr(self, test_name)() 2022-11-23T03:12:19.3430604Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3430692Z fn() 2022-11-23T03:12:19.3431116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3431229Z test(self, **param_kwargs) 2022-11-23T03:12:19.3431573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3431691Z return func(*args, **kwargs) 2022-11-23T03:12:19.3431912Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3432016Z self.run_subtests( 2022-11-23T03:12:19.3432355Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3432507Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3432860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3433002Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3433384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3433532Z output = model(*input) 2022-11-23T03:12:19.3433843Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3433975Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3434340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3434506Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3434859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3434969Z _lazy_init(state, module) 2022-11-23T03:12:19.3435307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3435443Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3435767Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3435885Z return func(*args, **kwargs) 2022-11-23T03:12:19.3436253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3436345Z p_assert( 2022-11-23T03:12:19.3436668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3436784Z traceback.print_stack() 2022-11-23T03:12:19.3436903Z File "", line 1, in 2022-11-23T03:12:19.3437103Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3437228Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3437421Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3437566Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3437820Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3437921Z self.run() 2022-11-23T03:12:19.3438114Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3438249Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3438573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3438698Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3439047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3439160Z getattr(self, test_name)() 2022-11-23T03:12:19.3439506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3439593Z fn() 2022-11-23T03:12:19.3440016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3440130Z test(self, **param_kwargs) 2022-11-23T03:12:19.3440468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3440584Z return func(*args, **kwargs) 2022-11-23T03:12:19.3440812Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3440915Z self.run_subtests( 2022-11-23T03:12:19.3441253Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3441406Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3441755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3441896Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3442261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3442371Z output = model(*input) 2022-11-23T03:12:19.3442685Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3442819Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3443180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3443344Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3443700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3443813Z _lazy_init(state, module) 2022-11-23T03:12:19.3444146Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3444283Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3444611Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3444726Z return func(*args, **kwargs) 2022-11-23T03:12:19.3445096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3445188Z p_assert( 2022-11-23T03:12:19.3445511Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3445630Z traceback.print_stack() 2022-11-23T03:12:19.3445742Z File "", line 1, in 2022-11-23T03:12:19.3445941Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3446073Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3446267Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3446412Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3446657Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3446759Z self.run() 2022-11-23T03:12:19.3446946Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3447083Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3447412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3447534Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3447883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3447996Z getattr(self, test_name)() 2022-11-23T03:12:19.3448342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3448431Z fn() 2022-11-23T03:12:19.3448834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3448954Z test(self, **param_kwargs) 2022-11-23T03:12:19.3449300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3449414Z return func(*args, **kwargs) 2022-11-23T03:12:19.3449644Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3449749Z self.run_subtests( 2022-11-23T03:12:19.3450089Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3450241Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3450584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3450728Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3451099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3451210Z output = model(*input) 2022-11-23T03:12:19.3451528Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3451658Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3452022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3452190Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3452541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3452654Z _lazy_init(state, module) 2022-11-23T03:12:19.3452994Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3453131Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3453463Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3453578Z return func(*args, **kwargs) 2022-11-23T03:12:19.3453947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3454040Z p_assert( 2022-11-23T03:12:19.3454355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3454470Z traceback.print_stack() 2022-11-23T03:12:19.3454589Z File "", line 1, in 2022-11-23T03:12:19.3454786Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3454920Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3455111Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3455258Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3455509Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3455606Z self.run() 2022-11-23T03:12:19.3455798Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3455933Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3456262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3456384Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3456733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3456847Z getattr(self, test_name)() 2022-11-23T03:12:19.3457188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3457277Z fn() 2022-11-23T03:12:19.3457686Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3457797Z test(self, **param_kwargs) 2022-11-23T03:12:19.3458145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3458260Z return func(*args, **kwargs) 2022-11-23T03:12:19.3458489Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3458592Z self.run_subtests( 2022-11-23T03:12:19.3458924Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3459077Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3459429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3459571Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3459939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3460051Z output = model(*input) 2022-11-23T03:12:19.3460363Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3460495Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3460850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3461017Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3461370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3461481Z _lazy_init(state, module) 2022-11-23T03:12:19.3461820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3461958Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3462287Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3462401Z return func(*args, **kwargs) 2022-11-23T03:12:19.3462770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3462857Z p_assert( 2022-11-23T03:12:19.3463181Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3463296Z traceback.print_stack() 2022-11-23T03:12:19.3463525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3463751Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3464202Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3464506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3464631Z File "", line 1, in 2022-11-23T03:12:19.3464847Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3464979Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3465171Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3465313Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3465515Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3465611Z self.run() 2022-11-23T03:12:19.3465805Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3465936Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3466381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3466575Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3466931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3467045Z getattr(self, test_name)() 2022-11-23T03:12:19.3467392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3467481Z fn() 2022-11-23T03:12:19.3467833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3467941Z test(self, **param_kwargs) 2022-11-23T03:12:19.3468288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3468403Z return func(*args, **kwargs) 2022-11-23T03:12:19.3468634Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3468742Z self.run_subtests( 2022-11-23T03:12:19.3469087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3469238Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3469592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3469728Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3470091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3470205Z output = model(*input) 2022-11-23T03:12:19.3470519Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3470651Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3471018Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3471192Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3471550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3471656Z _lazy_init(state, module) 2022-11-23T03:12:19.3471995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3472128Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3472456Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3472571Z return func(*args, **kwargs) 2022-11-23T03:12:19.3472938Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3473031Z p_assert( 2022-11-23T03:12:19.3473356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3473517Z traceback.print_stack() 2022-11-23T03:12:19.3473646Z File "", line 1, in 2022-11-23T03:12:19.3473844Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3473977Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3474168Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3474309Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3474510Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3474599Z self.run() 2022-11-23T03:12:19.3474793Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3474929Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3475259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3475431Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3475783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3475897Z getattr(self, test_name)() 2022-11-23T03:12:19.3476245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3476328Z fn() 2022-11-23T03:12:19.3476681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3476796Z test(self, **param_kwargs) 2022-11-23T03:12:19.3477141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3477256Z return func(*args, **kwargs) 2022-11-23T03:12:19.3477484Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3477592Z self.run_subtests( 2022-11-23T03:12:19.3477938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3478086Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3478436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3478580Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3478942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3479050Z output = model(*input) 2022-11-23T03:12:19.3479363Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3479495Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3479859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3480027Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3480385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3480497Z _lazy_init(state, module) 2022-11-23T03:12:19.3480835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3480970Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3481296Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3481410Z return func(*args, **kwargs) 2022-11-23T03:12:19.3481775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3481862Z p_assert( 2022-11-23T03:12:19.3482185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3482353Z traceback.print_stack() 2022-11-23T03:12:19.3482480Z File "", line 1, in 2022-11-23T03:12:19.3482679Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3482810Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3483001Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3483143Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3483339Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3483435Z self.run() 2022-11-23T03:12:19.3483629Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3483764Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3484094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3484265Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3484620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3484731Z getattr(self, test_name)() 2022-11-23T03:12:19.3485081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3485170Z fn() 2022-11-23T03:12:19.3485522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3485637Z test(self, **param_kwargs) 2022-11-23T03:12:19.3485981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3486095Z return func(*args, **kwargs) 2022-11-23T03:12:19.3486323Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3486425Z self.run_subtests( 2022-11-23T03:12:19.3486768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3486921Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3487275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3487417Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3487779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3487889Z output = model(*input) 2022-11-23T03:12:19.3488202Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3488328Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3488693Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3488866Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3489222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3489334Z _lazy_init(state, module) 2022-11-23T03:12:19.3489724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3489859Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3490189Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3490297Z return func(*args, **kwargs) 2022-11-23T03:12:19.3490666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3490760Z p_assert( 2022-11-23T03:12:19.3491083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3491251Z traceback.print_stack() 2022-11-23T03:12:19.3491377Z File "", line 1, in 2022-11-23T03:12:19.3491575Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3491709Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3491899Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3492039Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3492241Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3492336Z self.run() 2022-11-23T03:12:19.3492530Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3492666Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3492998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3493162Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3493520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3493639Z getattr(self, test_name)() 2022-11-23T03:12:19.3493984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3494074Z fn() 2022-11-23T03:12:19.3494427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3494539Z test(self, **param_kwargs) 2022-11-23T03:12:19.3494884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3494996Z return func(*args, **kwargs) 2022-11-23T03:12:19.3495224Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3495331Z self.run_subtests( 2022-11-23T03:12:19.3495673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3495828Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3496179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3496321Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3496682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3496787Z output = model(*input) 2022-11-23T03:12:19.3497101Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3497233Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3497594Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3497768Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3498124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3498234Z _lazy_init(state, module) 2022-11-23T03:12:19.3498572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3498701Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3499026Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3499144Z return func(*args, **kwargs) 2022-11-23T03:12:19.3499511Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3499604Z p_assert( 2022-11-23T03:12:19.3499933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3500095Z traceback.print_stack() 2022-11-23T03:12:19.3500331Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3500550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3500775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3500995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3501115Z File "", line 1, in 2022-11-23T03:12:19.3501315Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3501448Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3501639Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3501842Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3502042Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3502137Z self.run() 2022-11-23T03:12:19.3502333Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3502468Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3502802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3502925Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3503275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3503388Z getattr(self, test_name)() 2022-11-23T03:12:19.3503731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3503819Z fn() 2022-11-23T03:12:19.3504408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3504531Z test(self, **param_kwargs) 2022-11-23T03:12:19.3504877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3504991Z return func(*args, **kwargs) 2022-11-23T03:12:19.3505222Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3505327Z self.run_subtests( 2022-11-23T03:12:19.3505667Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3505818Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3506169Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3506313Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3506683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3506794Z output = model(*input) 2022-11-23T03:12:19.3507108Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3507240Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3507599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3507765Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3508118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3508228Z _lazy_init(state, module) 2022-11-23T03:12:19.3508564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3508701Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3509092Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3509215Z return func(*args, **kwargs) 2022-11-23T03:12:19.3509582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3509679Z p_assert( 2022-11-23T03:12:19.3510003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3510120Z traceback.print_stack() 2022-11-23T03:12:19.3510240Z File "", line 1, in 2022-11-23T03:12:19.3510439Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3510571Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3510756Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3510962Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3511167Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3511262Z self.run() 2022-11-23T03:12:19.3511453Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3511589Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3511919Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3512042Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3512385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3512499Z getattr(self, test_name)() 2022-11-23T03:12:19.3512844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3512931Z fn() 2022-11-23T03:12:19.3513285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3513407Z test(self, **param_kwargs) 2022-11-23T03:12:19.3513755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3513872Z return func(*args, **kwargs) 2022-11-23T03:12:19.3514095Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3514199Z self.run_subtests( 2022-11-23T03:12:19.3514540Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3514691Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3515042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3515185Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3515554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3515664Z output = model(*input) 2022-11-23T03:12:19.3515974Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3516108Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3516472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3516638Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3516992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3517103Z _lazy_init(state, module) 2022-11-23T03:12:19.3517443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3517582Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3517945Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3518066Z return func(*args, **kwargs) 2022-11-23T03:12:19.3518437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3518530Z p_assert( 2022-11-23T03:12:19.3518853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3518969Z traceback.print_stack() 2022-11-23T03:12:19.3519087Z File "", line 1, in 2022-11-23T03:12:19.3519289Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3519415Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3519610Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3519800Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3520007Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3520105Z self.run() 2022-11-23T03:12:19.3520297Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3520436Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3520760Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3520884Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3521235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3521351Z getattr(self, test_name)() 2022-11-23T03:12:19.3521699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3521786Z fn() 2022-11-23T03:12:19.3522147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3522260Z test(self, **param_kwargs) 2022-11-23T03:12:19.3522599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3522716Z return func(*args, **kwargs) 2022-11-23T03:12:19.3522946Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3523049Z self.run_subtests( 2022-11-23T03:12:19.3523387Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3523539Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3523890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3524032Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3524393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3524504Z output = model(*input) 2022-11-23T03:12:19.3524820Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3524954Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3525316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3525482Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3525836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3525950Z _lazy_init(state, module) 2022-11-23T03:12:19.3526281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3526421Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3526795Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3526917Z return func(*args, **kwargs) 2022-11-23T03:12:19.3527286Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3527379Z p_assert( 2022-11-23T03:12:19.3527701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3527818Z traceback.print_stack() 2022-11-23T03:12:19.3527930Z File "", line 1, in 2022-11-23T03:12:19.3528128Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3528260Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3528451Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3528653Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3528858Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3528953Z self.run() 2022-11-23T03:12:19.3529449Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3529581Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3529914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3530038Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3530386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3530501Z getattr(self, test_name)() 2022-11-23T03:12:19.3530846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3530933Z fn() 2022-11-23T03:12:19.3531296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:12:19.3531402Z test(self, **param_kwargs) 2022-11-23T03:12:19.3531745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3531859Z return func(*args, **kwargs) 2022-11-23T03:12:19.3532086Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3532188Z self.run_subtests( 2022-11-23T03:12:19.3532529Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3532683Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3533034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3533172Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3533596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3533713Z output = model(*input) 2022-11-23T03:12:19.3534030Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3534161Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3534525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3534691Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3535046Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3535151Z _lazy_init(state, module) 2022-11-23T03:12:19.3535489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3535628Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3536002Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3536123Z return func(*args, **kwargs) 2022-11-23T03:12:19.3536491Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3536584Z p_assert( 2022-11-23T03:12:19.3536906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:12:19.3537016Z traceback.print_stack() 2022-11-23T03:12:19.3537246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3537472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3537697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3537968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3538092Z File "", line 1, in 2022-11-23T03:12:19.3538291Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3538424Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3538611Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3538750Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3538956Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3539049Z self.run() 2022-11-23T03:12:19.3539240Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3539375Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3539709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3539829Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3540182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3540295Z getattr(self, test_name)() 2022-11-23T03:12:19.3540640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3540729Z fn() 2022-11-23T03:12:19.3541081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3541194Z test(self, **param_kwargs) 2022-11-23T03:12:19.3541538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3541647Z return func(*args, **kwargs) 2022-11-23T03:12:19.3541876Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3541986Z self.run_subtests( 2022-11-23T03:12:19.3542332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3542486Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3542837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3542981Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3543342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3543459Z output = model(*input) 2022-11-23T03:12:19.3543836Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3544195Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3544565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3544847Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3545210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3545322Z _lazy_init(state, module) 2022-11-23T03:12:19.3545661Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3545787Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3546112Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3546225Z return func(*args, **kwargs) 2022-11-23T03:12:19.3546591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3546684Z p_assert( 2022-11-23T03:12:19.3547006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3547186Z traceback.print_stack() 2022-11-23T03:12:19.3547307Z File "", line 1, in 2022-11-23T03:12:19.3547499Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3547632Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3547824Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3547966Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3548168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3548261Z self.run() 2022-11-23T03:12:19.3548453Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3548584Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3548914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3549042Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3549393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3549506Z getattr(self, test_name)() 2022-11-23T03:12:19.3549852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3549941Z fn() 2022-11-23T03:12:19.3550294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3550400Z test(self, **param_kwargs) 2022-11-23T03:12:19.3550740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3550855Z return func(*args, **kwargs) 2022-11-23T03:12:19.3551080Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3551188Z self.run_subtests( 2022-11-23T03:12:19.3551532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3551685Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3552034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3552173Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3552540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3552650Z output = model(*input) 2022-11-23T03:12:19.3552966Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3553097Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3553460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3553676Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3554042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3554147Z _lazy_init(state, module) 2022-11-23T03:12:19.3554488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3554623Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3554952Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3555068Z return func(*args, **kwargs) 2022-11-23T03:12:19.3555433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3555525Z p_assert( 2022-11-23T03:12:19.3555851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3556010Z traceback.print_stack() 2022-11-23T03:12:19.3556131Z File "", line 1, in 2022-11-23T03:12:19.3556332Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3556464Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3556657Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3556798Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3556999Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3557094Z self.run() 2022-11-23T03:12:19.3557279Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3557414Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3557743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3557869Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3558219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3558333Z getattr(self, test_name)() 2022-11-23T03:12:19.3558681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3558761Z fn() 2022-11-23T03:12:19.3559114Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3559230Z test(self, **param_kwargs) 2022-11-23T03:12:19.3559573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3559688Z return func(*args, **kwargs) 2022-11-23T03:12:19.3559918Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3560025Z self.run_subtests( 2022-11-23T03:12:19.3560367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3560515Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3560869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3561011Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3561373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3561483Z output = model(*input) 2022-11-23T03:12:19.3561797Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3561928Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3562293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3562500Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3562867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3562980Z _lazy_init(state, module) 2022-11-23T03:12:19.3563320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3563452Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3563777Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3563891Z return func(*args, **kwargs) 2022-11-23T03:12:19.3564256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3564349Z p_assert( 2022-11-23T03:12:19.3564665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3564847Z traceback.print_stack() 2022-11-23T03:12:19.3564966Z File "", line 1, in 2022-11-23T03:12:19.3565167Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:12:19.3565300Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:12:19.3565492Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:12:19.3565631Z return self._bootstrap(parent_sentinel) 2022-11-23T03:12:19.3565829Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:12:19.3565922Z self.run() 2022-11-23T03:12:19.3566116Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:12:19.3566251Z self._target(*self._args, **self._kwargs) 2022-11-23T03:12:19.3566584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:12:19.3566712Z self.run_test(test_name, pipe) 2022-11-23T03:12:19.3567065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:12:19.3567180Z getattr(self, test_name)() 2022-11-23T03:12:19.3567521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:12:19.3567609Z fn() 2022-11-23T03:12:19.3567961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T03:12:19.3568074Z test(self, **param_kwargs) 2022-11-23T03:12:19.3568415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:12:19.3568531Z return func(*args, **kwargs) 2022-11-23T03:12:19.3568760Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T03:12:19.3568861Z self.run_subtests( 2022-11-23T03:12:19.3569207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T03:12:19.3569360Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T03:12:19.3569711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:12:19.3569855Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:12:19.3570219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:12:19.3570329Z output = model(*input) 2022-11-23T03:12:19.3570644Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:12:19.3570776Z return forward_call(*input, **kwargs) 2022-11-23T03:12:19.3571135Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:12:19.3571351Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:12:19.3571718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:12:19.3571830Z _lazy_init(state, module) 2022-11-23T03:12:19.3572168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:12:19.3572301Z handle.init_flat_param_attributes() 2022-11-23T03:12:19.3572628Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:12:19.3572743Z return func(*args, **kwargs) 2022-11-23T03:12:19.3573104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:12:19.3573196Z p_assert( 2022-11-23T03:12:19.3573522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:12:19.3573686Z traceback.print_stack() 2022-11-23T03:12:19.3573919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3574142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3574364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3574585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3574796Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3575014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3575232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3575448Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3575669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3575883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3576100Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3576317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3577061Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3577802Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3578540Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3579265Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3580038Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3580776Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3581505Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3582229Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3582506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3582729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3582950Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3583170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3583387Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3583604Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3583820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3584262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3584485Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3584704Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3584923Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3585138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3585357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3585575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3585792Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3586006Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3586222Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3586439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3586654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3586867Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3587079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3587292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3587507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3587719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3588526Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.3589269Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3590045Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3590777Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3591559Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3592279Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3592998Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3593717Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3593946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3594170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3594394Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3594612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3594836Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3595052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3595267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3595481Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3595691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3595906Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3596123Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3596341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3596555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3596819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3597044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3597260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3597474Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3597680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3597894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3598107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3598320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3598583Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3598800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3599014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3599744Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3600467Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.3601186Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3601905Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3602626Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3603355Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3604071Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3604787Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3605009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3605282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3605508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3605727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3605943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3606158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3606374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3606590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3606807Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3607064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3607282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3607500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3607712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3607928Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3608140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3608355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3608570Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3608777Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3608996Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3609209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3609422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3609637Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3610015Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3610223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3610927Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3611631Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3612327Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:12:19.3613020Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3613947Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3614676Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3615394Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3616162Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:12:19.3616387Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3616603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:12:19.3616825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3617044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3617260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3617477Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3617859Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3618070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3618277Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3618478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3618685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3618892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:12:19.3618991Z dist init r=2, world=4 2022-11-23T03:12:19.3619298Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3619599Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3619887Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3620351Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 
2022-11-23T03:12:19.3620647Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3620939Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3621223Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3621561Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3621859Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3622151Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3622441Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3622732Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:12:19.3622891Z dist init r=0, world=4 2022-11-23T03:12:19.3623367Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3623664Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.3624336Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3624638Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3624924Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3625233Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3625524Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3625814Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3626104Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3626394Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3626684Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:12:19.3626978Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T03:12:19.3627080Z dist init r=3, world=4 2022-11-23T03:12:19.3627392Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3627697Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3627996Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3628507Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3628806Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3629089Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3629372Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3629652Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3629932Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3630270Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 
2022-11-23T03:12:19.3630551Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3630830Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:12:19.3630928Z dist init r=1, world=4 2022-11-23T03:12:19.3631230Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3631514Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3631804Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3632088Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3632370Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3632651Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3632932Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3633220Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T03:12:19.3633740Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3634033Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3634323Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3634612Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:12:19.3634705Z ok (11.132s) 2022-11-23T03:12:19.3634733Z 2022-11-23T03:12:19.3635001Z ---------------------------------------------------------------------- 2022-11-23T03:12:19.3635158Z Ran 59 tests in 625.521s 2022-11-23T03:12:19.3635183Z 2022-11-23T03:12:19.3635282Z OK (skipped=5) 2022-11-23T03:12:19.3635305Z 2022-11-23T03:12:19.3635423Z Generating XML reports... 
2022-11-23T03:12:19.3635823Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20221123030151.xml 2022-11-23T03:12:19.3636373Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20221123030151.xml 2022-11-23T03:12:19.3636755Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20221123030151.xml 2022-11-23T03:12:19.3637159Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20221123030151.xml 2022-11-23T03:12:19.3637180Z 2022-11-23T03:12:19.3637660Z ##[endgroup] 2022-11-23T03:12:19.3638151Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_core (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_core_8ncfwqpo) 2022-11-23T03:12:19.3638177Z 2022-11-23T03:12:19.3638200Z 2022-11-23T03:12:19.3638296Z real 10m33.609s 2022-11-23T03:12:19.3638386Z user 32m38.455s 2022-11-23T03:12:19.3638475Z sys 16m49.957s 2022-11-23T03:12:19.3638602Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:12:19.3639054Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_exec_order.py 2022-11-23T03:12:20.8249212Z Ignoring disabled issues: [] 2022-11-23T03:12:20.8778816Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:12:20.8779676Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:12:20.8780023Z Selected tests: 2022-11-23T03:12:20.8780294Z distributed/fsdp/test_fsdp_exec_order.py 2022-11-23T03:12:20.8805205Z Prioritized test from test file changes. 
2022-11-23T03:12:20.8805544Z reordering tests for PR: 2022-11-23T03:12:20.8805834Z prioritized: [] 2022-11-23T03:12:20.8806585Z the rest: ['distributed/fsdp/test_fsdp_exec_order.py'] 2022-11-23T03:12:20.8806805Z 2022-11-23T03:12:20.8807340Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:12:20.8808263Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:12:20.8816119Z parallel (file granularity) tests: 2022-11-23T03:12:20.8816388Z 2022-11-23T03:12:20.8816619Z serial (file granularity) tests: 2022-11-23T03:12:20.8816926Z distributed/fsdp/test_fsdp_exec_order.py 2022-11-23T03:12:23.1912327Z Ignoring disabled issues: [] 2022-11-23T03:12:23.5970430Z Running distributed/fsdp/test_fsdp_exec_order.py ... [2022-11-23 03:12:23.596501] 2022-11-23T03:12:23.5973548Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_exec_order.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:12:23.597013] 2022-11-23T03:13:06.9224266Z 2022-11-23T03:13:06.9225142Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_exec_order 2022-11-23T03:13:06.9228318Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_exec_order (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_exec_order_jk_m2f4m) 2022-11-23T03:13:06.9228721Z 2022-11-23T03:13:06.9228838Z Running tests... 
2022-11-23T03:13:06.9229429Z ---------------------------------------------------------------------- 2022-11-23T03:13:06.9230095Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order 2022-11-23T03:13:06.9230618Z test_invalid_first_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestFSDPExecOrder) 2022-11-23T03:13:06.9231321Z Tests that FSDP errors if the all-gather order differs across ranks ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:13:06.9237842Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29996 2022-11-23T03:13:06.9238415Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29997 2022-11-23T03:13:06.9238874Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29998 2022-11-23T03:13:06.9239327Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29999 2022-11-23T03:13:06.9239975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9240434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9241015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9241487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9242169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9242619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9243194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9243678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9244235Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9244700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9245278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9245752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9246309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9246770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9247345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9247791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9248248Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:06.9248744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:13:06.9249232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:13:06.9249701Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:13:06.9250367Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9251065Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9251749Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:13:06.9252409Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9252929Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:13:06.9253401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:06.9253871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:13:06.9254319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:13:06.9255802Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9256665Z warnings.warn( 2022-11-23T03:13:06.9257818Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9258653Z warnings.warn( 2022-11-23T03:13:06.9259807Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9260689Z warnings.warn( 2022-11-23T03:13:06.9261833Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9262601Z warnings.warn( 2022-11-23T03:13:06.9262851Z dist init r=0, world=4 2022-11-23T03:13:06.9263084Z dist init r=2, world=4 2022-11-23T03:13:06.9263333Z dist init r=1, world=4 2022-11-23T03:13:06.9263579Z dist init r=3, world=4 2022-11-23T03:13:06.9263793Z ok (6.573s) 2022-11-23T03:13:06.9264760Z test_invalid_first_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestFSDPExecOrder) 2022-11-23T03:13:06.9266120Z Tests that FSDP errors if the all-gather order differs across ranks ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30281 2022-11-23T03:13:06.9266674Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30282 2022-11-23T03:13:06.9267100Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30283 2022-11-23T03:13:06.9267537Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30284 2022-11-23T03:13:06.9268154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9268611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9269170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9269640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9270218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9270643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9271213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9271679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9272255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9272783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9273373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9273834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9274387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:13:06.9274827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9275395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9275855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9276292Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:06.9276870Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:13:06.9277358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:13:06.9277853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:13:06.9278494Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9279181Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9279862Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9280536Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:13:06.9281043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:13:06.9281512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:06.9281975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:13:06.9282422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:13:06.9283874Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9284650Z warnings.warn( 2022-11-23T03:13:06.9285793Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9286558Z warnings.warn( 2022-11-23T03:13:06.9287702Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9288447Z warnings.warn( 2022-11-23T03:13:06.9289653Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9290424Z warnings.warn( 2022-11-23T03:13:06.9290677Z dist init r=2, world=4 2022-11-23T03:13:06.9290907Z dist init r=3, world=4 2022-11-23T03:13:06.9291156Z dist init r=0, world=4 2022-11-23T03:13:06.9291401Z dist init r=1, world=4 2022-11-23T03:13:06.9291619Z ok (4.919s) 2022-11-23T03:13:06.9292038Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD_iters_before_path_change_1 (__main__.TestFSDPExecOrder) 2022-11-23T03:13:06.9292796Z Tests that FSDP warns the user if the all-gather order changes after ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30566 2022-11-23T03:13:06.9293393Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30567 2022-11-23T03:13:06.9293822Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30568 2022-11-23T03:13:06.9294364Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30569 2022-11-23T03:13:06.9294972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9295426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9295979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9296447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9297028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9297461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9298032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9298494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9299064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9299488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9300060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9300517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9301068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:13:06.9301513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9302081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9302542Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9302973Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:06.9303466Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:13:06.9304210Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:13:06.9304706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:13:06.9305353Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9306123Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9306823Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9307505Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:13:06.9308001Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:06.9308469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:13:06.9308938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:13:06.9309383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:13:06.9310654Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9311513Z warnings.warn( 2022-11-23T03:13:06.9312664Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9313431Z warnings.warn( 2022-11-23T03:13:06.9314582Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9315332Z warnings.warn( 2022-11-23T03:13:06.9316466Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9317230Z warnings.warn( 2022-11-23T03:13:06.9317478Z dist init r=2, world=4 2022-11-23T03:13:06.9317710Z dist init r=1, world=4 2022-11-23T03:13:06.9317958Z dist init r=0, world=4 2022-11-23T03:13:06.9318200Z dist init r=3, world=4 2022-11-23T03:13:06.9318415Z ok (4.819s) 2022-11-23T03:13:06.9318833Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD_iters_before_path_change_3 (__main__.TestFSDPExecOrder) 2022-11-23T03:13:06.9319579Z Tests that FSDP warns the user if the all-gather order changes after ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30867 2022-11-23T03:13:06.9320124Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30868 2022-11-23T03:13:06.9320550Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30869 2022-11-23T03:13:06.9320986Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30870 2022-11-23T03:13:06.9321593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9322101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9322664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9323129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9323713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9324140Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9324708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9325168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9325741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9326223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9326793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9327250Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9327800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:13:06.9328237Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9328809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9329365Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9329801Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:06.9330296Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:13:06.9330784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:13:06.9331267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:13:06.9331906Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9332590Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9333274Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9333952Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:13:06.9334449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:13:06.9334926Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:13:06.9335393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:06.9335844Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:13:06.9337110Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9337942Z warnings.warn( 2022-11-23T03:13:06.9339166Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9339931Z warnings.warn( 2022-11-23T03:13:06.9341075Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9341822Z warnings.warn( 2022-11-23T03:13:06.9343025Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9343782Z warnings.warn( 2022-11-23T03:13:06.9344461Z dist init r=2, world=4 2022-11-23T03:13:06.9344919Z dist init r=1, world=4 2022-11-23T03:13:06.9345371Z dist init r=0, world=4 2022-11-23T03:13:06.9345810Z dist init r=3, world=4 2022-11-23T03:13:06.9346248Z ok (4.919s) 2022-11-23T03:13:06.9346682Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP_iters_before_path_change_1 (__main__.TestFSDPExecOrder) 2022-11-23T03:13:06.9347447Z Tests that FSDP warns the user if the all-gather order changes after ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31168 2022-11-23T03:13:06.9348000Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31169 2022-11-23T03:13:06.9348429Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31170 2022-11-23T03:13:06.9348865Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31171 2022-11-23T03:13:06.9349471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9349924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9350481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9350948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9351528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9352021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9352588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9353049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9353617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9354034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9354607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9355266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9355823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:13:06.9356266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9356924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9357395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9357826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:06.9358321Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:13:06.9358810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:13:06.9359297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:13:06.9359932Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9360697Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9361377Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9362055Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:13:06.9362549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:13:06.9363018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:06.9363486Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:13:06.9363934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:13:06.9365190Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9365970Z warnings.warn( 2022-11-23T03:13:06.9367105Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9367862Z warnings.warn( 2022-11-23T03:13:06.9369008Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9369758Z warnings.warn( 2022-11-23T03:13:06.9370896Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9371654Z warnings.warn( 2022-11-23T03:13:06.9371907Z dist init r=2, world=4 2022-11-23T03:13:06.9372136Z dist init r=0, world=4 2022-11-23T03:13:06.9372432Z dist init r=1, world=4 2022-11-23T03:13:06.9372683Z dist init r=3, world=4 2022-11-23T03:13:06.9372900Z ok (4.919s) 2022-11-23T03:13:06.9373319Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP_iters_before_path_change_3 (__main__.TestFSDPExecOrder) 2022-11-23T03:13:06.9374072Z Tests that FSDP warns the user if the all-gather order changes after ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31469 2022-11-23T03:13:06.9374609Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31470 2022-11-23T03:13:06.9375038Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31471 2022-11-23T03:13:06.9375472Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31472 2022-11-23T03:13:06.9376076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9376587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9377141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9377610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9378183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9378604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9379173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9379635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9380318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9380746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9381316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9381779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9382331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:13:06.9382772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9383335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9383800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9384446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:13:06.9384940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:06.9385434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:13:06.9385919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:13:06.9386566Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9387260Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9387943Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9388625Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:13:06.9389122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:13:06.9389672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:06.9390151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:13:06.9390600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:13:06.9391864Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9392626Z warnings.warn( 2022-11-23T03:13:06.9393778Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9394610Z warnings.warn( 2022-11-23T03:13:06.9395759Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9396495Z warnings.warn( 2022-11-23T03:13:06.9397640Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9398402Z warnings.warn( 2022-11-23T03:13:06.9398653Z dist init r=0, world=4 2022-11-23T03:13:06.9398887Z dist init r=1, world=4 2022-11-23T03:13:06.9399135Z dist init r=2, world=4 2022-11-23T03:13:06.9399383Z dist init r=3, world=4 2022-11-23T03:13:06.9399600Z ok (4.919s) 2022-11-23T03:13:06.9400079Z test_train_eval_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestFSDPExecOrder) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31770 2022-11-23T03:13:06.9400649Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31771 2022-11-23T03:13:06.9401104Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31772 2022-11-23T03:13:06.9401532Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31773 2022-11-23T03:13:06.9402135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9402590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9403148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9403618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9404195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9404639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9405191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9405714Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9406300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9406740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9407286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9407747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9408319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:13:06.9408742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9409311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9409824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9410276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:13:06.9410757Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:13:06.9411247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:06.9411733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:13:06.9412385Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9413055Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9413734Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9414422Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:13:06.9414940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:13:06.9415391Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:06.9415847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:13:06.9416313Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:13:06.9417552Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9418327Z warnings.warn( 2022-11-23T03:13:06.9419478Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9420246Z warnings.warn( 2022-11-23T03:13:06.9421436Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9422207Z warnings.warn( 2022-11-23T03:13:06.9423322Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9424523Z warnings.warn( 2022-11-23T03:13:06.9424973Z dist init r=3, world=4 2022-11-23T03:13:06.9425421Z dist init r=1, world=4 2022-11-23T03:13:06.9425845Z dist init r=0, world=4 2022-11-23T03:13:06.9426421Z dist init r=2, world=4 2022-11-23T03:13:06.9426664Z ok (4.919s) 2022-11-23T03:13:06.9427140Z test_train_eval_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestFSDPExecOrder) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32071 2022-11-23T03:13:06.9427713Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32072 2022-11-23T03:13:06.9428161Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 32073 2022-11-23T03:13:06.9428601Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 32074 2022-11-23T03:13:06.9429210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9429655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9430229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9430683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9431264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9431708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9432284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9432727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9433300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:06.9433739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9434307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9434748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9435325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:13:06.9435763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:06.9436314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:06.9436778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:06.9437224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:13:06.9437786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:13:06.9438256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:13:06.9438741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:06.9439473Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9440171Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9440834Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:13:06.9441511Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:13:06.9442029Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:13:06.9442480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:13:06.9442947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:13:06.9443399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:06.9444725Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9445489Z warnings.warn( 2022-11-23T03:13:06.9446621Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9447396Z warnings.warn( 2022-11-23T03:13:06.9448538Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9449296Z warnings.warn( 2022-11-23T03:13:06.9450437Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:06.9451206Z warnings.warn( 2022-11-23T03:13:06.9451441Z dist init r=2, world=4 2022-11-23T03:13:06.9451689Z dist init r=3, world=4 2022-11-23T03:13:06.9451939Z dist init r=1, world=4 2022-11-23T03:13:06.9452166Z dist init r=0, world=4 2022-11-23T03:13:06.9452399Z ok (4.919s) 2022-11-23T03:13:06.9452548Z 2022-11-23T03:13:06.9452822Z ---------------------------------------------------------------------- 2022-11-23T03:13:06.9453135Z Ran 8 tests in 40.908s 2022-11-23T03:13:06.9453296Z 2022-11-23T03:13:06.9453388Z OK 2022-11-23T03:13:06.9453521Z 2022-11-23T03:13:06.9453646Z Generating XML reports... 
2022-11-23T03:13:06.9454230Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order/TEST-TestFSDPExecOrder-20221123031225.xml 2022-11-23T03:13:06.9454590Z 2022-11-23T03:13:06.9455108Z ##[endgroup] 2022-11-23T03:13:06.9455805Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_exec_order (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_exec_order_jk_m2f4m) 2022-11-23T03:13:06.9456177Z 2022-11-23T03:13:07.3446278Z 2022-11-23T03:13:07.3446736Z real 0m48.906s 2022-11-23T03:13:07.3447051Z user 2m25.945s 2022-11-23T03:13:07.3447292Z sys 1m35.859s 2022-11-23T03:13:07.3447560Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:13:07.3448178Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_flatten_params.py 2022-11-23T03:13:09.6711362Z Ignoring disabled issues: [] 2022-11-23T03:13:09.7238253Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:13:09.7238807Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:13:09.7239152Z Selected tests: 2022-11-23T03:13:09.7239460Z distributed/fsdp/test_fsdp_flatten_params.py 2022-11-23T03:13:09.7262872Z Prioritized test from test file changes. 
2022-11-23T03:13:09.7263562Z reordering tests for PR: 2022-11-23T03:13:09.7263841Z prioritized: [] 2022-11-23T03:13:09.7264922Z the rest: ['distributed/fsdp/test_fsdp_flatten_params.py'] 2022-11-23T03:13:09.7265129Z 2022-11-23T03:13:09.7265662Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:13:09.7266596Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:13:09.7270320Z parallel (file granularity) tests: 2022-11-23T03:13:09.7270582Z 2022-11-23T03:13:09.7270812Z serial (file granularity) tests: 2022-11-23T03:13:09.7271132Z distributed/fsdp/test_fsdp_flatten_params.py 2022-11-23T03:13:12.0449818Z Ignoring disabled issues: [] 2022-11-23T03:13:12.4498144Z Running distributed/fsdp/test_fsdp_flatten_params.py ... [2022-11-23 03:13:12.449287] 2022-11-23T03:13:12.4500433Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_flatten_params.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:13:12.449773] 2022-11-23T03:13:53.7089532Z 2022-11-23T03:13:53.7090485Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_flatten_params 2022-11-23T03:13:53.7091728Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_flatten_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_flatten_params_xeiuol3m) 2022-11-23T03:13:53.7092043Z 2022-11-23T03:13:53.7092166Z Running tests... 
2022-11-23T03:13:53.7094569Z ---------------------------------------------------------------------- 2022-11-23T03:13:53.7095056Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_flatten_params 2022-11-23T03:13:53.7095522Z test_empty_module (__main__.TestFlattenParams) 2022-11-23T03:13:53.7097732Z Tests flattening an empty module (i.e. one without any parameters). ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:13:53.7098429Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32584 2022-11-23T03:13:53.7099791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7100305Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7100849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7101370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7101783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7102459Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7103059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7103699Z dist init r=0, world=1 2022-11-23T03:13:53.7105074Z ok (5.638s) 2022-11-23T03:13:53.7105725Z test_flat_param_shard_metadata (__main__.TestFlattenParams) 2022-11-23T03:13:53.7106835Z Tests that ``FlatParameter`` shard metadata are computed as expected. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32656 2022-11-23T03:13:53.7107956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7108624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7109707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7110295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7111010Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7112478Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7113019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7113386Z dist init r=0, world=1 2022-11-23T03:13:53.7113616Z ok (3.915s) 2022-11-23T03:13:53.7114122Z test_flatten_nothing (__main__.TestFlattenParams) 2022-11-23T03:13:53.7115269Z Tests that constructing a ``FlatParamHandle`` with no parameters ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32728 2022-11-23T03:13:53.7116020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7116482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7117074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7117623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7118005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7118768Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7119193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7119561Z dist init r=0, world=1 2022-11-23T03:13:53.7119877Z ok (3.912s) 2022-11-23T03:13:53.7120107Z test_numel_with_shared_params (__main__.TestFlattenParams) 2022-11-23T03:13:53.7120621Z Tests that numel is preserved after flattening when there are shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32800 2022-11-23T03:13:53.7121316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7121773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7122371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7122815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7123371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7123946Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7124539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7124820Z dist init r=0, world=1 2022-11-23T03:13:53.7125072Z ok (3.912s) 2022-11-23T03:13:53.7125373Z test_numel_without_shared_params (__main__.TestFlattenParams) 2022-11-23T03:13:53.7125915Z Tests that numel is preserved after flattening when there are no shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32872 2022-11-23T03:13:53.7126694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7127163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7127724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7128278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7128659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7129391Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7129837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7130205Z dist init r=0, world=1 2022-11-23T03:13:53.7130582Z ok (3.912s) 2022-11-23T03:13:53.7130832Z test_output_with_shared_params (__main__.TestFlattenParams) 2022-11-23T03:13:53.7131363Z Tests a forward pass after flattening when there are shared parameters ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32944 2022-11-23T03:13:53.7132066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7132500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7133075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7133547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7134085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7134651Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7135271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7135552Z dist init r=0, world=1 2022-11-23T03:13:53.7135777Z ok (4.513s) 2022-11-23T03:13:53.7136102Z test_output_without_shared_params (__main__.TestFlattenParams) 2022-11-23T03:13:53.7136631Z Tests a forward pass after flattening when there are no shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33016 2022-11-23T03:13:53.7137356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7137812Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7138368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7138932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7139315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7140044Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7140494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7140861Z dist init r=0, world=1 2022-11-23T03:13:53.7141167Z ok (4.513s) 2022-11-23T03:13:53.7141400Z test_partial_flattening (__main__.TestFlattenParams) 2022-11-23T03:13:53.7141890Z Tests flattening some submodules but not others. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33088 2022-11-23T03:13:53.7142566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7142997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7143654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7144461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7144881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7145502Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7146062Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7147322Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:13:53.7148362Z warnings.warn( 2022-11-23T03:13:53.7148549Z dist init r=0, world=1 2022-11-23T03:13:53.7148777Z ok (4.012s) 2022-11-23T03:13:53.7149116Z test_pnorm_after_step_with_shared_params (__main__.TestFlattenParams) 2022-11-23T03:13:53.7149713Z Tests for parameter Frobenius norm parity after an optimizer step when ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33160 2022-11-23T03:13:53.7150350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:13:53.7150808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:13:53.7151391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:13:53.7151871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:13:53.7152314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:13:53.7152985Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:13:53.7153516Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:13:53.7153860Z dist init r=0, world=1 2022-11-23T03:13:53.7154114Z ok (4.513s) 2022-11-23T03:13:53.7154267Z 2022-11-23T03:13:53.7154540Z ---------------------------------------------------------------------- 2022-11-23T03:13:53.7154926Z Ran 9 tests in 38.840s 2022-11-23T03:13:53.7155023Z 2022-11-23T03:13:53.7155134Z OK 2022-11-23T03:13:53.7155262Z 2022-11-23T03:13:53.7155483Z Generating XML reports... 
2022-11-23T03:13:53.7156067Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_flatten_params/TEST-TestFlattenParams-20221123031314.xml 2022-11-23T03:13:53.7156374Z 2022-11-23T03:13:53.7156760Z ##[endgroup] 2022-11-23T03:13:53.7157412Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_flatten_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_flatten_params_xeiuol3m) 2022-11-23T03:13:53.7157794Z 2022-11-23T03:13:54.1294869Z 2022-11-23T03:13:54.1295191Z real 0m46.785s 2022-11-23T03:13:54.1295453Z user 1m1.343s 2022-11-23T03:13:54.1295799Z sys 0m56.008s 2022-11-23T03:13:54.1296090Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:13:54.1296603Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_freezing_weights.py 2022-11-23T03:13:56.5254805Z Ignoring disabled issues: [] 2022-11-23T03:13:56.5787104Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:13:56.5788217Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:13:56.5788595Z Selected tests: 2022-11-23T03:13:56.5788914Z distributed/fsdp/test_fsdp_freezing_weights.py 2022-11-23T03:13:56.5814802Z Prioritized test from test file changes. 
2022-11-23T03:13:56.5815907Z reordering tests for PR: 2022-11-23T03:13:56.5816565Z prioritized: [] 2022-11-23T03:13:56.5817264Z the rest: ['distributed/fsdp/test_fsdp_freezing_weights.py'] 2022-11-23T03:13:56.5817508Z 2022-11-23T03:13:56.5818053Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:13:56.5818984Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:13:56.5825239Z parallel (file granularity) tests: 2022-11-23T03:13:56.5825582Z 2022-11-23T03:13:56.5825810Z serial (file granularity) tests: 2022-11-23T03:13:56.5826145Z distributed/fsdp/test_fsdp_freezing_weights.py 2022-11-23T03:13:58.8923220Z Ignoring disabled issues: [] 2022-11-23T03:13:59.3125057Z Running distributed/fsdp/test_fsdp_freezing_weights.py ... [2022-11-23 03:13:59.311768] 2022-11-23T03:13:59.3126555Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_freezing_weights.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:13:59.312280] 2022-11-23T03:14:50.6918182Z 2022-11-23T03:14:50.6918850Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_freezing_weights 2022-11-23T03:14:50.6919855Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_freezing_weights (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_freezing_weights_8xi6y_fv) 2022-11-23T03:14:50.6926192Z 2022-11-23T03:14:50.6927041Z Running tests... 
2022-11-23T03:14:50.6927664Z ---------------------------------------------------------------------- 2022-11-23T03:14:50.6928301Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights 2022-11-23T03:14:50.6929002Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:14:50.6929633Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33448 2022-11-23T03:14:50.6930068Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33449 2022-11-23T03:14:50.6930505Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 33450 2022-11-23T03:14:50.6930954Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 33451 2022-11-23T03:14:50.6931558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6932015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6932595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6933074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6933662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6934117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6934692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6935158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6935727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:14:50.6936171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6936743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6937194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6938096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6938571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6939145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6939608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6940072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:14:50.6940568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:14:50.6941056Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:14:50.6941530Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:14:50.6942321Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6943014Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6943673Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6944837Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:14:50.6945359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:14:50.6945831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:14:50.6946301Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:14:50.6946746Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:14:50.6947233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6947727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6948192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6948729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6949097Z dist init r=3, world=4 2022-11-23T03:14:50.6949356Z dist init r=0, world=4 2022-11-23T03:14:50.6949597Z dist init r=1, world=4 2022-11-23T03:14:50.6949842Z dist init r=2, world=4 2022-11-23T03:14:50.6950075Z ok (7.684s) 2022-11-23T03:14:50.6950630Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33749 2022-11-23T03:14:50.6951412Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33750 2022-11-23T03:14:50.6951867Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 33751 2022-11-23T03:14:50.6952318Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 33752 2022-11-23T03:14:50.6952914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6953369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6953944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6954393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6954946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6955418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6956117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6956590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6957149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6957592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6958155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6958599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6959169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:14:50.6959609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6960265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6960706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6961154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:14:50.6961645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:14:50.6962128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:14:50.6962594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:14:50.6963245Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6963926Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6964594Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6965269Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:14:50.6965787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:14:50.6966255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:14:50.6966696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:14:50.6967148Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:14:50.6967624Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6968102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6968564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6969038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6970302Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.6971073Z warnings.warn( 2022-11-23T03:14:50.6972254Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:14:50.6973034Z warnings.warn( 2022-11-23T03:14:50.6974183Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.6975150Z warnings.warn( 2022-11-23T03:14:50.6976293Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.6977112Z warnings.warn( 2022-11-23T03:14:50.6977342Z dist init r=1, world=4 2022-11-23T03:14:50.6977595Z dist init r=3, world=4 2022-11-23T03:14:50.6977840Z dist init r=0, world=4 2022-11-23T03:14:50.6978066Z dist init r=2, world=4 2022-11-23T03:14:50.6978297Z ok (5.921s) 2022-11-23T03:14:50.6978861Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34050 2022-11-23T03:14:50.6979505Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34051 2022-11-23T03:14:50.6979936Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 34052 2022-11-23T03:14:50.6980373Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 34053 2022-11-23T03:14:50.6980984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6981411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6981981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6982444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6983018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6983438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6984229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6984709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6985271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.6985706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6986275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6986732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6987285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:14:50.6987727Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.6988286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.6988834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.6989279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:14:50.6990104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:14:50.6990588Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:14:50.6991054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:14:50.6991713Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6992397Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6993074Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.6993959Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:14:50.6994453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:14:50.6995094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:14:50.6995557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:14:50.6996004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:14:50.6996471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6996953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6997417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6997907Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.6998262Z dist init r=3, world=4 2022-11-23T03:14:50.6998511Z dist init r=2, world=4 2022-11-23T03:14:50.6998739Z dist init r=1, world=4 2022-11-23T03:14:50.6998982Z dist init r=0, world=4 2022-11-23T03:14:50.6999371Z ok (5.821s) 2022-11-23T03:14:50.6999896Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34351 2022-11-23T03:14:50.7000709Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34352 2022-11-23T03:14:50.7001153Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 34353 2022-11-23T03:14:50.7001595Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 34354 2022-11-23T03:14:50.7002202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7002653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7003358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7003770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7004319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7004770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7005330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7005756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7010213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7010777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7011369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7011828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7012385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:14:50.7012826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7013392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7013833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7014285Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:14:50.7014823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:14:50.7015309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:14:50.7015791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:14:50.7016587Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7017252Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7017907Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7018557Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:14:50.7019037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:14:50.7019679Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:14:50.7020135Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:14:50.7020598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:14:50.7021053Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7021532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7022007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7022464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7023788Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7024864Z warnings.warn( 2022-11-23T03:14:50.7026022Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:14:50.7026788Z warnings.warn( 2022-11-23T03:14:50.7028136Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7029045Z warnings.warn( 2022-11-23T03:14:50.7030369Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7031131Z warnings.warn( 2022-11-23T03:14:50.7031377Z dist init r=3, world=4 2022-11-23T03:14:50.7031607Z dist init r=0, world=4 2022-11-23T03:14:50.7031857Z dist init r=1, world=4 2022-11-23T03:14:50.7032261Z dist init r=2, world=4 2022-11-23T03:14:50.7032471Z ok (5.821s) 2022-11-23T03:14:50.7033010Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34652 2022-11-23T03:14:50.7033636Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34653 2022-11-23T03:14:50.7034067Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 34654 2022-11-23T03:14:50.7034473Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 34655 2022-11-23T03:14:50.7035067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7035685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7036263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7036716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7037298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7037740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7038447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7039086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7039664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7040105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7040661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7041121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7042024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:14:50.7042466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7043012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7043475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7043924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:14:50.7044399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:14:50.7044975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:14:50.7045510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:14:50.7046170Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7046835Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7047670Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7048321Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:14:50.7048876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:14:50.7049500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:14:50.7049963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:14:50.7050427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:14:50.7050888Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7051369Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7051842Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7052317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7052810Z dist init r=2, world=4 2022-11-23T03:14:50.7053050Z dist init r=0, world=4 2022-11-23T03:14:50.7053288Z dist init r=1, world=4 2022-11-23T03:14:50.7053506Z dist init r=3, world=4 2022-11-23T03:14:50.7053735Z ok (6.022s) 2022-11-23T03:14:50.7054472Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34953 2022-11-23T03:14:50.7055110Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34954 2022-11-23T03:14:50.7055541Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 34955 2022-11-23T03:14:50.7055980Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 34956 2022-11-23T03:14:50.7056590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7057177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7057730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7058185Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7058742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7059151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7059698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7060142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7060678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7061291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7061855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7062305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7062999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:14:50.7063448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7064223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7064698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7065148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:14:50.7065621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:14:50.7066110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:14:50.7066594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:14:50.7067254Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7068071Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7068730Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7069575Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:14:50.7070092Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:14:50.7070545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:14:50.7070997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:14:50.7071467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:14:50.7071946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7072409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7073046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7073505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7074906Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7075669Z warnings.warn( 2022-11-23T03:14:50.7076827Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:14:50.7077579Z warnings.warn( 2022-11-23T03:14:50.7078700Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7079568Z warnings.warn( 2022-11-23T03:14:50.7080915Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7081858Z warnings.warn( 2022-11-23T03:14:50.7082104Z dist init r=1, world=4 2022-11-23T03:14:50.7082350Z dist init r=3, world=4 2022-11-23T03:14:50.7082576Z dist init r=2, world=4 2022-11-23T03:14:50.7082818Z dist init r=0, world=4 2022-11-23T03:14:50.7083051Z ok (6.021s) 2022-11-23T03:14:50.7083596Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35378 2022-11-23T03:14:50.7084401Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35379 2022-11-23T03:14:50.7084835Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 35380 2022-11-23T03:14:50.7085259Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 35381 2022-11-23T03:14:50.7085829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7086444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7087016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7087481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7088040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7088486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7089054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7089498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7090072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7090509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7091076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7091518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7092085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:14:50.7092525Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7093096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7093694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7094129Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:14:50.7094601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:14:50.7095233Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:14:50.7095715Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:14:50.7096365Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7097176Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7097903Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7098626Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:14:50.7099174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:14:50.7099807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:14:50.7100241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:14:50.7100858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:14:50.7101332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7101806Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7102281Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7102750Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7103105Z dist init r=2, world=4 2022-11-23T03:14:50.7103495Z dist init r=1, world=4 2022-11-23T03:14:50.7103736Z dist init r=3, world=4 2022-11-23T03:14:50.7104403Z dist init r=0, world=4 2022-11-23T03:14:50.7104625Z ok (5.821s) 2022-11-23T03:14:50.7105184Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35679 2022-11-23T03:14:50.7105829Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35680 2022-11-23T03:14:50.7106261Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 35681 2022-11-23T03:14:50.7106711Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 35682 2022-11-23T03:14:50.7107322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7107767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7108327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7108791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7109526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7109952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7110480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7110928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7111477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:14:50.7111883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7112429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7113166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7113739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:14:50.7114158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:14:50.7114724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:14:50.7115282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:14:50.7115814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:14:50.7116319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:14:50.7116963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:14:50.7117429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:14:50.7118040Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7118699Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7119353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:14:50.7120210Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:14:50.7120704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:14:50.7121171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:14:50.7121630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:14:50.7122094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:14:50.7122548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7123027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7123551Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7124013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:14:50.7125277Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7126049Z warnings.warn( 2022-11-23T03:14:50.7127192Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:14:50.7127963Z warnings.warn( 2022-11-23T03:14:50.7129256Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7129982Z warnings.warn( 2022-11-23T03:14:50.7131134Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:14:50.7131933Z warnings.warn( 2022-11-23T03:14:50.7132170Z dist init r=1, world=4 2022-11-23T03:14:50.7132418Z dist init r=2, world=4 2022-11-23T03:14:50.7132671Z dist init r=3, world=4 2022-11-23T03:14:50.7132906Z dist init r=0, world=4 2022-11-23T03:14:50.7133293Z ok (5.821s) 2022-11-23T03:14:50.7133441Z 2022-11-23T03:14:50.7133720Z ---------------------------------------------------------------------- 2022-11-23T03:14:50.7134052Z Ran 8 tests in 48.933s 2022-11-23T03:14:50.7134208Z 2022-11-23T03:14:50.7134299Z OK 2022-11-23T03:14:50.7134413Z 2022-11-23T03:14:50.7134535Z Generating XML reports... 
2022-11-23T03:14:50.7135156Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights/TEST-TestFreezingWeights-20221123031401.xml 2022-11-23T03:14:50.7135529Z 2022-11-23T03:14:50.7135952Z ##[endgroup] 2022-11-23T03:14:50.7136581Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_freezing_weights (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_freezing_weights_8xi6y_fv) 2022-11-23T03:14:50.7136960Z 2022-11-23T03:14:51.0495589Z 2022-11-23T03:14:51.0495874Z real 0m56.920s 2022-11-23T03:14:51.0496144Z user 2m47.585s 2022-11-23T03:14:51.0496396Z sys 1m46.488s 2022-11-23T03:14:51.0496663Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:14:51.0497194Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_fx.py 2022-11-23T03:14:53.4142420Z Ignoring disabled issues: [] 2022-11-23T03:14:53.4672313Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:14:53.4672903Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:14:53.4673257Z Selected tests: 2022-11-23T03:14:53.4673567Z distributed/fsdp/test_fsdp_fx.py 2022-11-23T03:14:53.4697893Z Prioritized test from test file changes. 
2022-11-23T03:14:53.4698286Z reordering tests for PR: 2022-11-23T03:14:53.4698558Z prioritized: [] 2022-11-23T03:14:53.4699049Z the rest: ['distributed/fsdp/test_fsdp_fx.py'] 2022-11-23T03:14:53.4699237Z 2022-11-23T03:14:53.4699773Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:14:53.4700714Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:14:53.4706598Z parallel (file granularity) tests: 2022-11-23T03:14:53.4707129Z 2022-11-23T03:14:53.4707365Z serial (file granularity) tests: 2022-11-23T03:14:53.4707667Z distributed/fsdp/test_fsdp_fx.py 2022-11-23T03:14:55.7905915Z Ignoring disabled issues: [] 2022-11-23T03:14:56.2077381Z Running distributed/fsdp/test_fsdp_fx.py ... [2022-11-23 03:14:56.206978] 2022-11-23T03:14:56.2078157Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_fx.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:14:56.207433] 2022-11-23T03:15:04.7707676Z 2022-11-23T03:15:04.7708420Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_fx 2022-11-23T03:15:04.7710515Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_fx (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_fx_2ryj6p1c) 2022-11-23T03:15:04.7711066Z 2022-11-23T03:15:04.7711186Z Running tests... 
2022-11-23T03:15:04.7711703Z ---------------------------------------------------------------------- 2022-11-23T03:15:04.7712262Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_fx 2022-11-23T03:15:04.7712727Z test_symbolic_tracing_outputs (__main__.TestSymbolicTracing) 2022-11-23T03:15:04.7713210Z test ``execution_info.module_forward_order`` and ``execution_info.module_to_execution_infos`` ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:15:04.7714146Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36316 2022-11-23T03:15:04.7714649Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36317 2022-11-23T03:15:04.7715145Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 36318 2022-11-23T03:15:04.7715612Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 36319 2022-11-23T03:15:04.7716284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:15:04.7716761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:15:04.7717348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:15:04.7717852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:15:04.7718469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:15:04.7718948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:15:04.7719543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:15:04.7720044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:15:04.7720658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T03:15:04.7721135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:15:04.7721710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:15:04.7722194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:15:04.7722805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:15:04.7723288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:15:04.7723910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:15:04.7724405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:15:04.7724880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:15:04.7725386Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:15:04.7725913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:15:04.7726440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:15:04.7727114Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:15:04.7727848Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:15:04.7728579Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:15:04.7729310Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:15:04.7729839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:15:04.7730343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:15:04.7730831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:15:04.7731332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:15:04.7731763Z dist init r=3, world=4 2022-11-23T03:15:04.7732026Z dist init r=0, world=4 2022-11-23T03:15:04.7732336Z dist init r=1, world=4 2022-11-23T03:15:04.7732584Z dist init r=2, world=4 2022-11-23T03:15:04.7732833Z ok (6.165s) 2022-11-23T03:15:04.7732988Z 2022-11-23T03:15:04.7733278Z ---------------------------------------------------------------------- 2022-11-23T03:15:04.7733608Z Ran 1 test in 6.165s 2022-11-23T03:15:04.7733777Z 2022-11-23T03:15:04.7733873Z OK 2022-11-23T03:15:04.7734010Z 2022-11-23T03:15:04.7734136Z Generating XML reports... 
2022-11-23T03:15:04.7734755Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_fx/TEST-TestSymbolicTracing-20221123031458.xml 2022-11-23T03:15:04.7735105Z 2022-11-23T03:15:04.7735424Z ##[endgroup] 2022-11-23T03:15:04.7736034Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_fx (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_fx_2ryj6p1c) 2022-11-23T03:15:04.7736401Z 2022-11-23T03:15:05.1516330Z 2022-11-23T03:15:05.1517201Z real 0m14.102s 2022-11-23T03:15:05.1517793Z user 0m29.042s 2022-11-23T03:15:05.1518132Z sys 0m20.926s 2022-11-23T03:15:05.1518407Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:15:05.1519002Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_grad_acc.py 2022-11-23T03:15:07.5261771Z Ignoring disabled issues: [] 2022-11-23T03:15:07.5797247Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:15:07.5798405Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:15:07.5798773Z Selected tests: 2022-11-23T03:15:07.5799070Z distributed/fsdp/test_fsdp_grad_acc.py 2022-11-23T03:15:07.5822371Z Prioritized test from test file changes. 
2022-11-23T03:15:07.5823011Z reordering tests for PR: 2022-11-23T03:15:07.5823635Z prioritized: [] 2022-11-23T03:15:07.5824997Z the rest: ['distributed/fsdp/test_fsdp_grad_acc.py'] 2022-11-23T03:15:07.5825225Z 2022-11-23T03:15:07.5825772Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:15:07.5826720Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:15:07.5830370Z parallel (file granularity) tests: 2022-11-23T03:15:07.5830980Z 2022-11-23T03:15:07.5831440Z serial (file granularity) tests: 2022-11-23T03:15:07.5831755Z distributed/fsdp/test_fsdp_grad_acc.py 2022-11-23T03:15:09.9052494Z Ignoring disabled issues: [] 2022-11-23T03:15:10.3127018Z Running distributed/fsdp/test_fsdp_grad_acc.py ... [2022-11-23 03:15:10.312086] 2022-11-23T03:15:10.3128104Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_grad_acc.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:15:10.312522] 2022-11-23T03:16:14.0057776Z 2022-11-23T03:16:14.0058534Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_grad_acc 2022-11-23T03:16:14.0060207Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_grad_acc (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_grad_acc_84669szp) 2022-11-23T03:16:14.0061614Z 2022-11-23T03:16:14.0061826Z Running tests... 
2022-11-23T03:16:14.0065870Z ---------------------------------------------------------------------- 2022-11-23T03:16:14.0067179Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_grad_acc 2022-11-23T03:16:14.0067893Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestGradAcc) 2022-11-23T03:16:14.0068606Z Tests gradient accumulation. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:16:14.0069337Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36813 2022-11-23T03:16:14.0070074Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36814 2022-11-23T03:16:14.0072585Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 36815 2022-11-23T03:16:14.0073514Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 36816 2022-11-23T03:16:14.0074787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0075661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0076831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0077676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0078792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0079547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0080437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0081205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:16:14.0082105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0082591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0083244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0083700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0084363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0084839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0085426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0085879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0086350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0086868Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0087349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0087859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0088534Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0089246Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0089918Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0090599Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0091114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0091588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0092038Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0092500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0093890Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0094801Z warnings.warn( 2022-11-23T03:16:14.0095950Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0096720Z warnings.warn( 2022-11-23T03:16:14.0097853Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0098615Z warnings.warn( 2022-11-23T03:16:14.0099754Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0100523Z warnings.warn( 2022-11-23T03:16:14.0100780Z dist init r=0, world=4 2022-11-23T03:16:14.0101015Z dist init r=1, world=4 2022-11-23T03:16:14.0101267Z dist init r=2, world=4 2022-11-23T03:16:14.0101513Z dist init r=3, world=4 2022-11-23T03:16:14.0101730Z ok (7.457s) 2022-11-23T03:16:14.0102259Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestGradAcc) 2022-11-23T03:16:14.0102927Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37114 2022-11-23T03:16:14.0103420Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37115 2022-11-23T03:16:14.0104244Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 37116 2022-11-23T03:16:14.0104719Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 37117 2022-11-23T03:16:14.0105348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0105784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0106363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0106835Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0107416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0107841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0108416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0108884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0109447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0110103Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0110686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0111148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0111701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0112148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0112719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0113183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0113617Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0114115Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0114611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0115084Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0115745Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0116431Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0117107Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0117766Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0118288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0118764Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0119230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0119738Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0120997Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0121771Z warnings.warn( 2022-11-23T03:16:14.0122921Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0123681Z warnings.warn( 2022-11-23T03:16:14.0124806Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0125639Z warnings.warn( 2022-11-23T03:16:14.0126832Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0127598Z warnings.warn( 2022-11-23T03:16:14.0127848Z dist init r=3, world=4 2022-11-23T03:16:14.0128080Z dist init r=0, world=4 2022-11-23T03:16:14.0128325Z dist init r=2, world=4 2022-11-23T03:16:14.0128577Z dist init r=1, world=4 2022-11-23T03:16:14.0128793Z ok (5.521s) 2022-11-23T03:16:14.0129325Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestGradAcc) 2022-11-23T03:16:14.0130000Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37415 2022-11-23T03:16:14.0130491Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37416 2022-11-23T03:16:14.0130920Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 37417 2022-11-23T03:16:14.0131360Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 37418 2022-11-23T03:16:14.0131971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0132407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0132982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0133451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0134034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0134465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0135037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0135499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0136075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0136499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0137068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0137529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0138082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0138703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0139278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0139740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0140174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0140668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0141154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0141639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0142278Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0143090Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0143778Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0144751Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0145248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0145718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0146182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0146629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0147887Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0148663Z warnings.warn( 2022-11-23T03:16:14.0149813Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0150583Z warnings.warn( 2022-11-23T03:16:14.0151722Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0152466Z warnings.warn( 2022-11-23T03:16:14.0153606Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0154360Z warnings.warn( 2022-11-23T03:16:14.0154610Z dist init r=3, world=4 2022-11-23T03:16:14.0154847Z dist init r=0, world=4 2022-11-23T03:16:14.0155095Z dist init r=1, world=4 2022-11-23T03:16:14.0155340Z dist init r=2, world=4 2022-11-23T03:16:14.0155558Z ok (5.521s) 2022-11-23T03:16:14.0156087Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestGradAcc) 2022-11-23T03:16:14.0156750Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37716 2022-11-23T03:16:14.0157242Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37717 2022-11-23T03:16:14.0157731Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 37718 2022-11-23T03:16:14.0158169Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 37719 2022-11-23T03:16:14.0158955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0159416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0159977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0160448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0161024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0161450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0162022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0162487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0163063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0163488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0164054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0164518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0165091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0165508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0166073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0166535Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0166966Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0167471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0167960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0168445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0169085Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0169773Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0170449Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0171122Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0171626Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0172102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0172566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0173016Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0173370Z dist init r=3, world=4 2022-11-23T03:16:14.0173620Z dist init r=0, world=4 2022-11-23T03:16:14.0173846Z dist init r=2, world=4 2022-11-23T03:16:14.0174095Z dist init r=1, world=4 2022-11-23T03:16:14.0174330Z ok (4.418s) 2022-11-23T03:16:14.0174862Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestGradAcc) 2022-11-23T03:16:14.0175621Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38001 2022-11-23T03:16:14.0176121Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38002 2022-11-23T03:16:14.0176566Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 38003 2022-11-23T03:16:14.0177011Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 38004 2022-11-23T03:16:14.0177604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0178055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0178628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0179082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0179655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0180108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0180679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0181122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0181696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0182138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0182685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0183148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0183719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0184375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0184940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0185399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0185848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0186343Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0186810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0187299Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0187951Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0188629Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0189320Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0190002Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0190520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0190972Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0191431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0191896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0192253Z dist init r=3, world=4 2022-11-23T03:16:14.0192628Z dist init r=0, world=4 2022-11-23T03:16:14.0192877Z dist init r=1, world=4 2022-11-23T03:16:14.0193187Z dist init r=2, world=4 2022-11-23T03:16:14.0193414Z ok (4.317s) 2022-11-23T03:16:14.0193955Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestGradAcc) 2022-11-23T03:16:14.0194624Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38286 2022-11-23T03:16:14.0195160Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38287 2022-11-23T03:16:14.0195615Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 38288 2022-11-23T03:16:14.0196054Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 38289 2022-11-23T03:16:14.0196667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0197107Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0197680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0198148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0198725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0199150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0199717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0200178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0200727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0201180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0201752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0202209Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0202764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0203210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0203784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0204228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0204680Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0205173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0205666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0206138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0206787Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0207471Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0208148Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0208804Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0209324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0209907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0210376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0210822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0211171Z dist init r=3, world=4 2022-11-23T03:16:14.0211424Z dist init r=0, world=4 2022-11-23T03:16:14.0211653Z dist init r=1, world=4 2022-11-23T03:16:14.0211901Z dist init r=2, world=4 2022-11-23T03:16:14.0212135Z ok (4.318s) 2022-11-23T03:16:14.0212649Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestGradAcc) 2022-11-23T03:16:14.0213312Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38571 2022-11-23T03:16:14.0213804Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38572 2022-11-23T03:16:14.0214252Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 38573 2022-11-23T03:16:14.0214679Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 38574 2022-11-23T03:16:14.0215291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0215739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0216292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0216762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0217340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0217786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0218342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0218811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0219382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0219888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0220441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0220902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0221467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0221885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0222458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0222918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0223367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0223842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0224608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0225097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0225737Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0226423Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0227254Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0227935Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0228431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0228899Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0229358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0229825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0231076Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0231844Z warnings.warn( 2022-11-23T03:16:14.0232993Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0233766Z warnings.warn( 2022-11-23T03:16:14.0234904Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0235669Z warnings.warn( 2022-11-23T03:16:14.0236793Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0237554Z warnings.warn( 2022-11-23T03:16:14.0237802Z dist init r=2, world=4 2022-11-23T03:16:14.0238061Z dist init r=3, world=4 2022-11-23T03:16:14.0238289Z dist init r=1, world=4 2022-11-23T03:16:14.0238536Z dist init r=0, world=4 2022-11-23T03:16:14.0238776Z ok (5.621s) 2022-11-23T03:16:14.0239286Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestGradAcc) 2022-11-23T03:16:14.0239950Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38872 2022-11-23T03:16:14.0240440Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38873 2022-11-23T03:16:14.0240884Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 38874 2022-11-23T03:16:14.0241307Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 38875 2022-11-23T03:16:14.0241906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0242456Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0243064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0243544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0244125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0244571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0245124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0245593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0246166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0246614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0247169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0247633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0248206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0248625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0249189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0249645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0250094Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0250570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0251062Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0251547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0252198Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0252864Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0253548Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0254231Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0254747Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0255203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0255664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0256131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0257373Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0258196Z warnings.warn( 2022-11-23T03:16:14.0259413Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0260228Z warnings.warn( 2022-11-23T03:16:14.0261369Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0262132Z warnings.warn( 2022-11-23T03:16:14.0263253Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0264227Z warnings.warn( 2022-11-23T03:16:14.0264481Z dist init r=0, world=4 2022-11-23T03:16:14.0264736Z dist init r=2, world=4 2022-11-23T03:16:14.0264964Z dist init r=3, world=4 2022-11-23T03:16:14.0265215Z dist init r=1, world=4 2022-11-23T03:16:14.0265451Z ok (5.420s) 2022-11-23T03:16:14.0265965Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestGradAcc) 2022-11-23T03:16:14.0266634Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39173 2022-11-23T03:16:14.0267131Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39174 2022-11-23T03:16:14.0267576Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39175 2022-11-23T03:16:14.0268004Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 39176 2022-11-23T03:16:14.0268614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0269065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0269642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0270095Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0270677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0271130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0271685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0272151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0272729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0273176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0273725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0274190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0274766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0275283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0275922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0276393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0276844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0277319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0277806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0278288Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0278938Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0279609Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0280295Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0280980Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0281498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0281953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0282411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0282874Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0284145Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0284904Z warnings.warn( 2022-11-23T03:16:14.0286055Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0286819Z warnings.warn( 2022-11-23T03:16:14.0287962Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0288731Z warnings.warn( 2022-11-23T03:16:14.0289852Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:16:14.0290610Z warnings.warn( 2022-11-23T03:16:14.0290922Z dist init r=2, world=4 2022-11-23T03:16:14.0291174Z dist init r=0, world=4 2022-11-23T03:16:14.0291455Z dist init r=3, world=4 2022-11-23T03:16:14.0291705Z dist init r=1, world=4 2022-11-23T03:16:14.0291943Z ok (5.623s) 2022-11-23T03:16:14.0292450Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestGradAcc) 2022-11-23T03:16:14.0293114Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39474 2022-11-23T03:16:14.0293609Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39475 2022-11-23T03:16:14.0294052Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39476 2022-11-23T03:16:14.0294474Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 39477 2022-11-23T03:16:14.0295086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0295540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0296115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0296569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0297144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0297584Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0298135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0298601Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0299212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0299660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0300215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0300681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0301255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0301696Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0302244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0302705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0303154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0303631Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0304386Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0304927Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0305583Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0306256Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0306934Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0307615Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0308232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0308743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0309207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0309674Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0310010Z dist init r=3, world=4 2022-11-23T03:16:14.0310261Z dist init r=0, world=4 2022-11-23T03:16:14.0310509Z dist init r=1, world=4 2022-11-23T03:16:14.0310740Z dist init r=2, world=4 2022-11-23T03:16:14.0310977Z ok (4.318s) 2022-11-23T03:16:14.0311502Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestGradAcc) 2022-11-23T03:16:14.0312162Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39759 2022-11-23T03:16:14.0312642Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39760 2022-11-23T03:16:14.0313086Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39761 2022-11-23T03:16:14.0313528Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 39762 2022-11-23T03:16:14.0314145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0314585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0315161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0315631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0316187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0316636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0317208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0317672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0318226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0318671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0319235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0319834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0320414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0320861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0321427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0321871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0322324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0322820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0323314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0323784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0324434Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0325125Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0325905Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0326598Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0327113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0327579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0328026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0328480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0328836Z dist init r=3, world=4 2022-11-23T03:16:14.0329090Z dist init r=0, world=4 2022-11-23T03:16:14.0329328Z dist init r=1, world=4 2022-11-23T03:16:14.0329578Z dist init r=2, world=4 2022-11-23T03:16:14.0329814Z ok (4.318s) 2022-11-23T03:16:14.0330332Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestGradAcc) 2022-11-23T03:16:14.0331004Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40044 2022-11-23T03:16:14.0331493Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40045 2022-11-23T03:16:14.0331918Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40046 2022-11-23T03:16:14.0332364Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40047 2022-11-23T03:16:14.0332964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0333416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0333977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0334447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0335021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0335469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0336022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0336485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0337059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:14.0337484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0338056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0338515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0339082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:14.0339499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:14.0340069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:14.0340527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:14.0340958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:14.0341455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:14.0342003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:14.0342561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:14.0343208Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0344084Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0344786Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:14.0345463Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:14.0345964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:14.0346436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:14.0346907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:14.0347374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:14.0347709Z dist init r=3, world=4 2022-11-23T03:16:14.0347966Z dist init r=0, world=4 2022-11-23T03:16:14.0348285Z dist init r=2, world=4 2022-11-23T03:16:14.0348512Z dist init r=1, world=4 2022-11-23T03:16:14.0348750Z ok (4.419s) 2022-11-23T03:16:14.0348900Z 2022-11-23T03:16:14.0349176Z ---------------------------------------------------------------------- 2022-11-23T03:16:14.0349495Z Ran 12 tests in 61.273s 2022-11-23T03:16:14.0349660Z 2022-11-23T03:16:14.0349754Z OK 2022-11-23T03:16:14.0349888Z 2022-11-23T03:16:14.0350014Z Generating XML reports... 
2022-11-23T03:16:14.0350576Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_grad_acc/TEST-TestGradAcc-20221123031512.xml 2022-11-23T03:16:14.0350914Z 2022-11-23T03:16:14.0351458Z ##[endgroup] 2022-11-23T03:16:14.0352070Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_grad_acc (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_grad_acc_84669szp) 2022-11-23T03:16:14.0352425Z 2022-11-23T03:16:14.4212318Z 2022-11-23T03:16:14.4212653Z real 1m9.269s 2022-11-23T03:16:14.4212970Z user 3m39.609s 2022-11-23T03:16:14.4213224Z sys 2m18.297s 2022-11-23T03:16:14.4213496Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:16:14.4214046Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_ignored_modules.py 2022-11-23T03:16:16.7577913Z Ignoring disabled issues: [] 2022-11-23T03:16:16.8111572Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:16:16.8112147Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:16:16.8112525Z Selected tests: 2022-11-23T03:16:16.8112812Z distributed/fsdp/test_fsdp_ignored_modules.py 2022-11-23T03:16:16.8137862Z Prioritized test from test file changes. 
2022-11-23T03:16:16.8138318Z reordering tests for PR: 2022-11-23T03:16:16.8138892Z prioritized: [] 2022-11-23T03:16:16.8139690Z the rest: ['distributed/fsdp/test_fsdp_ignored_modules.py'] 2022-11-23T03:16:16.8139918Z 2022-11-23T03:16:16.8140447Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:16:16.8141382Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:16:16.8146900Z parallel (file granularity) tests: 2022-11-23T03:16:16.8147480Z 2022-11-23T03:16:16.8147988Z serial (file granularity) tests: 2022-11-23T03:16:16.8148401Z distributed/fsdp/test_fsdp_ignored_modules.py 2022-11-23T03:16:19.1465649Z Ignoring disabled issues: [] 2022-11-23T03:16:19.5501007Z Running distributed/fsdp/test_fsdp_ignored_modules.py ... [2022-11-23 03:16:19.549372] 2022-11-23T03:16:19.5501878Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_ignored_modules.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:16:19.549871] 2022-11-23T03:16:47.8228008Z 2022-11-23T03:16:47.8228562Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_ignored_modules 2022-11-23T03:16:47.8237557Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_ignored_modules (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_ignored_modules_ftkq7xgr) 2022-11-23T03:16:47.8237957Z 2022-11-23T03:16:47.8238073Z Running tests... 
2022-11-23T03:16:47.8238618Z ---------------------------------------------------------------------- 2022-11-23T03:16:47.8239209Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules 2022-11-23T03:16:47.8239782Z test_diff_ignored_modules_across_ranks_pass_ignored_modules_to_root_False (__main__.TestFSDPIgnoredModules) 2022-11-23T03:16:47.8240263Z Tests ignoring different modules across ranks. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:16:47.8240732Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40541 2022-11-23T03:16:47.8241179Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40542 2022-11-23T03:16:47.8241644Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40543 2022-11-23T03:16:47.8242070Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40544 2022-11-23T03:16:47.8242680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8243142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8243703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8244181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8244765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8245211Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8245766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8246229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8246809Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8247259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8247845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8248307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8248890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8249356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8249908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8250371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8250829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:47.8251327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:47.8251801Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:47.8252569Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:47.8253333Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8254006Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8254687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:47.8255368Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8255886Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:47.8256339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:47.8256802Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:47.8257265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:47.8257619Z dist init r=0, world=4 2022-11-23T03:16:47.8257854Z dist init r=1, world=4 2022-11-23T03:16:47.8258103Z dist init r=2, world=4 2022-11-23T03:16:47.8258349Z dist init r=3, world=4 2022-11-23T03:16:47.8258566Z ok (6.647s) 2022-11-23T03:16:47.8276729Z test_diff_ignored_modules_across_ranks_pass_ignored_modules_to_root_True (__main__.TestFSDPIgnoredModules) 2022-11-23T03:16:47.8277711Z Tests ignoring different modules across ranks. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40842 2022-11-23T03:16:47.8278546Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40843 2022-11-23T03:16:47.8279388Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40844 2022-11-23T03:16:47.8280156Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40845 2022-11-23T03:16:47.8281198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8281943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8282913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8283382Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8283947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8284416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8285005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8285466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8286026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8286481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8287114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8287578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8288147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:47.8288572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8289138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8289597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8290045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:47.8290754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:47.8291254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:47.8291745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:47.8292381Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8293064Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8293737Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8294490Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:47.8294999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:47.8295462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:47.8295918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:47.8296384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:47.8296717Z dist init r=3, world=4 2022-11-23T03:16:47.8296968Z dist init r=1, world=4 2022-11-23T03:16:47.8297218Z dist init r=0, world=4 2022-11-23T03:16:47.8297444Z dist init r=2, world=4 2022-11-23T03:16:47.8297676Z ok (4.919s) 2022-11-23T03:16:47.8297994Z test_ignored_modules_invalid (__main__.TestFSDPIgnoredModules) 2022-11-23T03:16:47.8298492Z Tests that passing an FSDP module as an ignored module or the ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41143 2022-11-23T03:16:47.8299021Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41144 2022-11-23T03:16:47.8299466Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41145 2022-11-23T03:16:47.8299907Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41146 2022-11-23T03:16:47.8300501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8300947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8301514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8302033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8302608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8303045Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8303618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8304439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8305022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8305461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8306021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8306462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8307032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8307459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8308222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8308730Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8309199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:47.8309720Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:47.8310223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:47.8310747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:47.8311438Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:47.8312162Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8312876Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8313599Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8314161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:47.8314639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:47.8315138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:47.8315631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:47.8316004Z dist init r=3, world=4 2022-11-23T03:16:47.8316245Z dist init r=0, world=4 2022-11-23T03:16:47.8316506Z dist init r=2, world=4 2022-11-23T03:16:47.8316770Z dist init r=1, world=4 2022-11-23T03:16:47.8316998Z ok (4.317s) 2022-11-23T03:16:47.8317333Z test_ignored_modules_nested (__main__.TestFSDPIgnoredModules) 2022-11-23T03:16:47.8317881Z Tests that passing a module with nested FSDP modules does not ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41428 2022-11-23T03:16:47.8318423Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41429 2022-11-23T03:16:47.8318902Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41430 2022-11-23T03:16:47.8319374Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41431 2022-11-23T03:16:47.8320018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8320471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8321070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8321564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8322176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8322628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8323234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8323733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8324328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8324815Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8325473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8326014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8326630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:16:47.8327100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8327735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8328182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8328636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:47.8329131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:47.8329617Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:47.8330089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:47.8330741Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8331425Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8332107Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8332769Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:47.8333292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:47.8333757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:47.8334221Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:47.8334669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:47.8335171Z dist init r=1, world=4 2022-11-23T03:16:47.8335428Z dist init r=0, world=4 2022-11-23T03:16:47.8335663Z dist init r=3, world=4 2022-11-23T03:16:47.8335916Z dist init r=2, world=4 2022-11-23T03:16:47.8336160Z ok (4.819s) 2022-11-23T03:16:47.8336478Z test_ignored_modules_transformer (__main__.TestFSDPIgnoredModules) 2022-11-23T03:16:47.8337136Z Tests that ignored modules' parameters are not flattened for a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41729 2022-11-23T03:16:47.8337669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41730 2022-11-23T03:16:47.8338096Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41731 2022-11-23T03:16:47.8338541Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41732 2022-11-23T03:16:47.8339139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8339598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8340150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8340609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8341179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8341630Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8342180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8342640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8343208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8343693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8344600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:16:47.8345056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:16:47.8345629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8346075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8346655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:16:47.8347112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:16:47.8347541Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:16:47.8348047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:16:47.8348553Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:16:47.8349048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:16:47.8349680Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:16:47.8350361Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8351036Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8351699Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:16:47.8352197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:16:47.8352669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:16:47.8353131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:16:47.8353591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:16:47.8353929Z dist init r=3, world=4 2022-11-23T03:16:47.8354178Z dist init r=0, world=4 2022-11-23T03:16:47.8354426Z dist init r=2, world=4 2022-11-23T03:16:47.8354652Z dist init r=1, world=4 2022-11-23T03:16:47.8354885Z ok (5.120s) 2022-11-23T03:16:47.8355033Z 2022-11-23T03:16:47.8355307Z ---------------------------------------------------------------------- 2022-11-23T03:16:47.8355618Z Ran 5 tests in 25.822s 2022-11-23T03:16:47.8355775Z 2022-11-23T03:16:47.8355866Z OK 2022-11-23T03:16:47.8355998Z 2022-11-23T03:16:47.8356123Z Generating XML reports... 
2022-11-23T03:16:47.8356744Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules/TEST-TestFSDPIgnoredModules-20221123031621.xml 2022-11-23T03:16:47.8357124Z 2022-11-23T03:16:47.8357473Z ##[endgroup] 2022-11-23T03:16:47.8358105Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_ignored_modules (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_ignored_modules_ftkq7xgr) 2022-11-23T03:16:47.8358480Z 2022-11-23T03:16:48.2264900Z 2022-11-23T03:16:48.2265310Z real 0m33.805s 2022-11-23T03:16:48.2265557Z user 1m36.513s 2022-11-23T03:16:48.2266659Z sys 1m8.081s 2022-11-23T03:16:48.2267076Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:16:48.2267715Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_input.py 2022-11-23T03:16:50.5840235Z Ignoring disabled issues: [] 2022-11-23T03:16:50.6367159Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:16:50.6368210Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:16:50.6368679Z Selected tests: 2022-11-23T03:16:50.6368978Z distributed/fsdp/test_fsdp_input.py 2022-11-23T03:16:50.6396980Z Prioritized test from test file changes. 
2022-11-23T03:16:50.6397361Z reordering tests for PR: 2022-11-23T03:16:50.6397657Z prioritized: [] 2022-11-23T03:16:50.6398439Z the rest: ['distributed/fsdp/test_fsdp_input.py'] 2022-11-23T03:16:50.6398710Z 2022-11-23T03:16:50.6399184Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:16:50.6400143Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:16:50.6404581Z parallel (file granularity) tests: 2022-11-23T03:16:50.6404961Z 2022-11-23T03:16:50.6405247Z serial (file granularity) tests: 2022-11-23T03:16:50.6405557Z distributed/fsdp/test_fsdp_input.py 2022-11-23T03:16:52.9653093Z Ignoring disabled issues: [] 2022-11-23T03:16:53.4068095Z Running distributed/fsdp/test_fsdp_input.py ... [2022-11-23 03:16:53.406247] 2022-11-23T03:16:53.4069574Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_input.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:16:53.406725] 2022-11-23T03:17:06.4478968Z 2022-11-23T03:17:06.4479638Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_input 2022-11-23T03:17:06.4480930Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_input (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_input_ux3lsah_) 2022-11-23T03:17:06.4481300Z 2022-11-23T03:17:06.4481520Z Running tests... 2022-11-23T03:17:06.4482003Z ---------------------------------------------------------------------- 2022-11-23T03:17:06.4482549Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_input 2022-11-23T03:17:06.4483086Z test_input_type_dict (__main__.TestInput) 2022-11-23T03:17:06.4483537Z Test FSDP with input being a list or a dict, only single GPU. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:17:06.4484032Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42242 2022-11-23T03:17:06.4484658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:17:06.4485130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:17:06.4485721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:17:06.4486181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:17:06.4486654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:17:06.4487332Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:17:06.4487867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:17:06.4489116Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:17:06.4489922Z warnings.warn( 2022-11-23T03:17:06.4490160Z dist init r=0, world=1 2022-11-23T03:17:06.4490417Z ok (6.181s) 2022-11-23T03:17:06.4490703Z test_input_type_list (__main__.TestInput) 2022-11-23T03:17:06.4491160Z Test FSDP with input being a list or a dict, only single GPU. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42318 2022-11-23T03:17:06.4492334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:17:06.4492807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:17:06.4493411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:17:06.4493925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:17:06.4494416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:17:06.4495123Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:17:06.4495665Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:17:06.4497028Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:17:06.4497874Z warnings.warn( 2022-11-23T03:17:06.4498153Z dist init r=0, world=1 2022-11-23T03:17:06.4498392Z ok (4.414s) 2022-11-23T03:17:06.4498554Z 2022-11-23T03:17:06.4498849Z ---------------------------------------------------------------------- 2022-11-23T03:17:06.4499214Z Ran 2 tests in 10.595s 2022-11-23T03:17:06.4499389Z 2022-11-23T03:17:06.4499465Z OK 2022-11-23T03:17:06.4499607Z 2022-11-23T03:17:06.4499738Z Generating XML reports... 
2022-11-23T03:17:06.4500339Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_input/TEST-TestInput-20221123031655.xml 2022-11-23T03:17:06.4500698Z 2022-11-23T03:17:06.4501016Z ##[endgroup] 2022-11-23T03:17:06.4501652Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_input (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_input_ux3lsah_) 2022-11-23T03:17:06.4502043Z 2022-11-23T03:17:06.8038527Z 2022-11-23T03:17:06.8039146Z real 0m18.577s 2022-11-23T03:17:06.8039420Z user 0m26.474s 2022-11-23T03:17:06.8039639Z sys 0m25.106s 2022-11-23T03:17:06.8039926Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:17:06.8040560Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_memory.py 2022-11-23T03:17:09.1407784Z Ignoring disabled issues: [] 2022-11-23T03:17:09.1934986Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:17:09.1935565Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:17:09.1935953Z Selected tests: 2022-11-23T03:17:09.1936223Z distributed/fsdp/test_fsdp_memory.py 2022-11-23T03:17:09.1961111Z Prioritized test from test file changes. 
2022-11-23T03:17:09.1962371Z reordering tests for PR: 2022-11-23T03:17:09.1962733Z prioritized: [] 2022-11-23T03:17:09.1963277Z the rest: ['distributed/fsdp/test_fsdp_memory.py'] 2022-11-23T03:17:09.1963512Z 2022-11-23T03:17:09.1963975Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:17:09.1964935Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:17:09.1969303Z parallel (file granularity) tests: 2022-11-23T03:17:09.1969587Z 2022-11-23T03:17:09.1969833Z serial (file granularity) tests: 2022-11-23T03:17:09.1970168Z distributed/fsdp/test_fsdp_memory.py 2022-11-23T03:17:11.5269159Z Ignoring disabled issues: [] 2022-11-23T03:17:11.9417876Z Running distributed/fsdp/test_fsdp_memory.py ... [2022-11-23 03:17:11.941299] 2022-11-23T03:17:11.9420150Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_memory.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:17:11.941703] 2022-11-23T03:17:31.4595942Z 2022-11-23T03:17:31.4596580Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_memory 2022-11-23T03:17:31.4601200Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_memory (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_memory_h6zk11qn) 2022-11-23T03:17:31.4601627Z 2022-11-23T03:17:31.4601863Z Running tests... 2022-11-23T03:17:31.4602484Z ---------------------------------------------------------------------- 2022-11-23T03:17:31.4603239Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_memory 2022-11-23T03:17:31.4603810Z test_fsdp_memory_ckpt_ckpt (__main__.TestFSDPMemory) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:17:31.4604418Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42606 2022-11-23T03:17:31.4605022Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42607 2022-11-23T03:17:31.4605737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:17:31.4606370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:17:31.4607010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:17:31.4607671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:17:31.4608346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:17:31.4609072Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:17:31.4609553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:17:31.4610296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:17:31.4610772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:17:31.4611493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:17:31.4612218Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:17:31.4613085Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:17:31.4613635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:17:31.4614388Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:17:31.4615847Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:17:31.4616896Z warnings.warn( 2022-11-23T03:17:31.4618332Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:17:31.4619675Z warnings.warn( 2022-11-23T03:17:31.4619948Z dist init r=0, world=2 2022-11-23T03:17:31.4620332Z dist init r=1, world=2 2022-11-23T03:17:31.4620757Z ok (9.634s) 2022-11-23T03:17:31.4621239Z test_fsdp_memory_ckpt_no_ckpt (__main__.TestFSDPMemory) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42819 2022-11-23T03:17:31.4621811Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42820 2022-11-23T03:17:31.4622632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:17:31.4623155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:17:31.4624284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:17:31.4625053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:17:31.4625657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:17:31.4626180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:17:31.4626859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:17:31.4627354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:17:31.4627996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:17:31.4628509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:17:31.4629403Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:17:31.4630242Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:17:31.4630828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:17:31.4631367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:17:31.4633008Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:17:31.4633846Z warnings.warn( 2022-11-23T03:17:31.4635203Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:17:31.4636197Z warnings.warn( 2022-11-23T03:17:31.4636468Z dist init r=1, world=2 2022-11-23T03:17:31.4636897Z dist init r=0, world=2 2022-11-23T03:17:31.4637161Z ok (7.423s) 2022-11-23T03:17:31.4637316Z 2022-11-23T03:17:31.4637600Z ---------------------------------------------------------------------- 2022-11-23T03:17:31.4638008Z Ran 2 tests in 17.057s 2022-11-23T03:17:31.4638317Z 2022-11-23T03:17:31.4638399Z OK 2022-11-23T03:17:31.4638540Z 2022-11-23T03:17:31.4638674Z Generating XML reports... 
2022-11-23T03:17:31.4639325Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_memory/TEST-TestFSDPMemory-20221123031713.xml 2022-11-23T03:17:31.4639839Z 2022-11-23T03:17:31.4640154Z ##[endgroup] 2022-11-23T03:17:31.4640975Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_memory (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_memory_h6zk11qn) 2022-11-23T03:17:31.4641564Z 2022-11-23T03:17:31.7928377Z 2022-11-23T03:17:31.7928883Z real 0m24.989s 2022-11-23T03:17:31.7929211Z user 0m41.663s 2022-11-23T03:17:31.7929469Z sys 0m32.421s 2022-11-23T03:17:31.7929742Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:17:31.7930460Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_meta.py 2022-11-23T03:17:34.2191978Z Ignoring disabled issues: [] 2022-11-23T03:17:34.2729424Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:17:34.2729974Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:17:34.2730326Z Selected tests: 2022-11-23T03:17:34.2730610Z distributed/fsdp/test_fsdp_meta.py 2022-11-23T03:17:34.2756298Z Prioritized test from test file changes. 
2022-11-23T03:17:34.2756629Z reordering tests for PR: 2022-11-23T03:17:34.2756895Z prioritized: [] 2022-11-23T03:17:34.2757404Z the rest: ['distributed/fsdp/test_fsdp_meta.py'] 2022-11-23T03:17:34.2757614Z 2022-11-23T03:17:34.2758126Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:17:34.2759066Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:17:34.2765838Z parallel (file granularity) tests: 2022-11-23T03:17:34.2766134Z 2022-11-23T03:17:34.2766427Z serial (file granularity) tests: 2022-11-23T03:17:34.2766726Z distributed/fsdp/test_fsdp_meta.py 2022-11-23T03:17:36.6080685Z Ignoring disabled issues: [] 2022-11-23T03:17:36.9665229Z Running distributed/fsdp/test_fsdp_meta.py ... [2022-11-23 03:17:36.965886] 2022-11-23T03:17:36.9666533Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_meta.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:17:36.966450] 2022-11-23T03:18:12.9437744Z 2022-11-23T03:18:12.9438708Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_meta 2022-11-23T03:18:12.9439676Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_meta (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_meta_t6bxst_0) 2022-11-23T03:18:12.9445426Z 2022-11-23T03:18:12.9446290Z Running tests... 2022-11-23T03:18:12.9446869Z ---------------------------------------------------------------------- 2022-11-23T03:18:12.9447456Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_meta 2022-11-23T03:18:12.9448041Z test_bad_arg_meta (__main__.TestFSDPWithMetaDevice) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:18:12.9448477Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43244 2022-11-23T03:18:12.9448975Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43245 2022-11-23T03:18:12.9449632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9450099Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9450674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9451162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9451754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9452194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9452786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9453340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9454298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:18:12.9454891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:18:12.9455489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9456253Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:18:12.9456727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:18:12.9457279Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:18:12.9457694Z dist init r=1, world=2 2022-11-23T03:18:12.9457956Z dist init r=0, world=2 2022-11-23T03:18:12.9458241Z ok (5.782s) 2022-11-23T03:18:12.9458635Z test_bad_arg_torchdistx (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-11-23T03:18:12.9459389Z test_nested_model_with_meta_device_default_init_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43387 2022-11-23T03:18:12.9460032Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43388 2022-11-23T03:18:12.9460596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9461082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9461674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9462149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9462716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9463189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9463788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9464663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9465075Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T03:18:12.9465558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:18:12.9466279Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9466899Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9467419Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:18:12.9467913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:18:12.9468337Z dist init r=1, world=2 2022-11-23T03:18:12.9468511Z dist init r=0, world=2 2022-11-23T03:18:12.9468758Z ok (4.615s) 2022-11-23T03:18:12.9469259Z test_nested_model_with_meta_device_default_init_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43538 2022-11-23T03:18:12.9469867Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43539 2022-11-23T03:18:12.9470443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9470909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9471481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9472048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9472793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9473192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9473793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 
disabled tests 2022-11-23T03:18:12.9474247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9474761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:18:12.9475264Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:18:12.9475929Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9476634Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9477143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:18:12.9477621Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:18:12.9477988Z dist init r=1, world=2 2022-11-23T03:18:12.9478229Z dist init r=0, world=2 2022-11-23T03:18:12.9478478Z ok (4.615s) 2022-11-23T03:18:12.9478981Z test_nested_model_with_meta_device_reset_params_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43689 2022-11-23T03:18:12.9479570Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43690 2022-11-23T03:18:12.9480165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9480627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9481211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9481674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9482249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9482705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9483283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9483733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9484189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:18:12.9484686Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:18:12.9485329Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9486026Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:18:12.9486563Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:18:12.9487040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:18:12.9487383Z dist init r=1, world=2 2022-11-23T03:18:12.9487648Z dist init r=0, world=2 2022-11-23T03:18:12.9487901Z ok (4.615s) 2022-11-23T03:18:12.9488375Z test_nested_model_with_meta_device_reset_params_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43840 2022-11-23T03:18:12.9489024Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43841 2022-11-23T03:18:12.9489772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9490235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9490794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9491275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9491860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9492314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9492864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9493320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9493786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:18:12.9494285Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:18:12.9494944Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9495611Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9496136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:18:12.9496616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:18:12.9496978Z dist init r=0, world=2 2022-11-23T03:18:12.9497214Z dist init r=1, world=2 2022-11-23T03:18:12.9497457Z ok (4.615s) 2022-11-23T03:18:12.9497966Z test_nested_model_with_torchdistX_default_init_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-11-23T03:18:12.9498676Z test_nested_model_with_torchdistX_default_init_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-11-23T03:18:12.9499395Z test_nested_model_with_torchdistX_init_fn_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-11-23T03:18:12.9500103Z test_nested_model_with_torchdistX_init_fn_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-11-23T03:18:12.9500797Z test_simple_model_with_meta_device_default_init (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43991 2022-11-23T03:18:12.9501360Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43992 2022-11-23T03:18:12.9501974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9502434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9503010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9503515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9504342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9504735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9505318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9505764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9506322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:18:12.9506882Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:18:12.9507533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9508223Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:18:12.9508751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:18:12.9509232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:18:12.9509569Z dist init r=1, world=2 2022-11-23T03:18:12.9509828Z dist init r=0, world=2 2022-11-23T03:18:12.9510072Z ok (4.615s) 2022-11-23T03:18:12.9510525Z test_simple_model_with_meta_device_reset_params (__main__.TestFSDPWithMetaDevice) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44142 2022-11-23T03:18:12.9511105Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44143 2022-11-23T03:18:12.9511722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9512176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9512730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9513202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9513788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:18:12.9514236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:18:12.9514786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:18:12.9515264Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:18:12.9515723Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:18:12.9516203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:18:12.9516868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9517557Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:18:12.9518079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:18:12.9518576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:18:12.9518889Z dist init r=1, world=2 2022-11-23T03:18:12.9519151Z dist init r=0, world=2 2022-11-23T03:18:12.9519455Z ok (4.615s) 2022-11-23T03:18:12.9519860Z test_simple_model_with_torchdistX_default_init (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-11-23T03:18:12.9520550Z test_simple_model_with_torchdistX_init_fn (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-11-23T03:18:12.9520909Z 2022-11-23T03:18:12.9521189Z ---------------------------------------------------------------------- 2022-11-23T03:18:12.9521508Z Ran 14 tests in 33.478s 2022-11-23T03:18:12.9521673Z 2022-11-23T03:18:12.9521785Z OK (skipped=7) 2022-11-23T03:18:12.9521940Z 2022-11-23T03:18:12.9522067Z Generating XML reports... 
2022-11-23T03:18:12.9522662Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_meta/TEST-TestFSDPWithMetaDevice-20221123031739.xml 2022-11-23T03:18:12.9523092Z 2022-11-23T03:18:12.9523570Z ##[endgroup] 2022-11-23T03:18:12.9524275Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_meta (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_meta_t6bxst_0) 2022-11-23T03:18:12.9524585Z 2022-11-23T03:18:13.3238516Z 2022-11-23T03:18:13.3239114Z real 0m41.531s 2022-11-23T03:18:13.3239627Z user 1m17.473s 2022-11-23T03:18:13.3240043Z sys 1m6.433s 2022-11-23T03:18:13.3240493Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:18:13.3241507Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_misc.py 2022-11-23T03:18:15.6903668Z Ignoring disabled issues: [] 2022-11-23T03:18:15.7431253Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:18:15.7431846Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:18:15.7432212Z Selected tests: 2022-11-23T03:18:15.7432483Z distributed/fsdp/test_fsdp_misc.py 2022-11-23T03:18:15.7459581Z Prioritized test from test file changes. 
2022-11-23T03:18:15.7459932Z reordering tests for PR: 2022-11-23T03:18:15.7460211Z prioritized: [] 2022-11-23T03:18:15.7460752Z the rest: ['distributed/fsdp/test_fsdp_misc.py'] 2022-11-23T03:18:15.7460971Z 2022-11-23T03:18:15.7461522Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:18:15.7462536Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:18:15.7466526Z parallel (file granularity) tests: 2022-11-23T03:18:15.7466810Z 2022-11-23T03:18:15.7467858Z serial (file granularity) tests: 2022-11-23T03:18:15.7468505Z distributed/fsdp/test_fsdp_misc.py 2022-11-23T03:18:18.0522301Z Ignoring disabled issues: [] 2022-11-23T03:18:18.4566790Z Running distributed/fsdp/test_fsdp_misc.py ... [2022-11-23 03:18:18.456257] 2022-11-23T03:18:18.4568182Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_misc.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:18:18.456657] 2022-11-23T03:19:30.9618593Z 2022-11-23T03:19:30.9619228Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_misc 2022-11-23T03:19:30.9622261Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_misc (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_misc_n21uonwg) 2022-11-23T03:19:30.9622691Z 2022-11-23T03:19:30.9622826Z Running tests... 2022-11-23T03:19:30.9623384Z ---------------------------------------------------------------------- 2022-11-23T03:19:30.9624257Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_misc 2022-11-23T03:19:30.9624709Z test_cpu_init_with_sync_module_states (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9626639Z Tests that passing ``sync_module_states=True`` raises an error for ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:19:30.9627212Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44505 2022-11-23T03:19:30.9627638Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44506 2022-11-23T03:19:30.9628219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9628758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9629801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9630299Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9630865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9631356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9632236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9632840Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9633230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9633826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9634463Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9635201Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:19:30.9635710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9636193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9637491Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:19:30.9638422Z warnings.warn( 2022-11-23T03:19:30.9639586Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:19:30.9640366Z warnings.warn( 2022-11-23T03:19:30.9640634Z dist init r=0, world=2 2022-11-23T03:19:30.9640879Z dist init r=1, world=2 2022-11-23T03:19:30.9641128Z ok (5.859s) 2022-11-23T03:19:30.9641419Z test_device_id_auto_wrap (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9641883Z Tests that ``auto_wrap_policy`` propagates ``device_id`` to all ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44648 2022-11-23T03:19:30.9642335Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44649 2022-11-23T03:19:30.9642955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9643412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9643975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9644467Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9645105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9645554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9646109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9646576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9647030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9647510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9648173Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9648859Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:19:30.9649575Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9650036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9650381Z dist init r=0, world=2 2022-11-23T03:19:30.9650628Z dist init r=1, world=2 2022-11-23T03:19:30.9650846Z ok (4.113s) 2022-11-23T03:19:30.9651144Z test_fsdp_cpu_init_stays_on_cpu (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9651649Z Tests that passing a CPU module to FSDP preserves that the wrapped ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44791 2022-11-23T03:19:30.9652178Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44792 2022-11-23T03:19:30.9652770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9653223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9653797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9654242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9654815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9655379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9655947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9656396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9656856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9657354Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T03:19:30.9658026Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9658701Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9659220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9659691Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9660025Z dist init r=0, world=2 2022-11-23T03:19:30.9660272Z dist init r=1, world=2 2022-11-23T03:19:30.9660507Z ok (4.515s) 2022-11-23T03:19:30.9660790Z test_fsdp_device_id_cpu_offload (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9661278Z Ensures that even if device_id is specified but we have ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44942 2022-11-23T03:19:30.9661796Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44943 2022-11-23T03:19:30.9662412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9662844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9663410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9664121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9664807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9665142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9665708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9666169Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9666714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9667266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9667925Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9668603Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9669105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9669581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9669938Z dist init r=1, world=2 2022-11-23T03:19:30.9670169Z dist init r=0, world=2 2022-11-23T03:19:30.9670407Z ok (4.114s) 2022-11-23T03:19:30.9670711Z test_fsdp_device_id_use_index_False (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9671210Z Tests the FSDP ``device_id`` argument: ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45085 2022-11-23T03:19:30.9671732Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45086 2022-11-23T03:19:30.9672273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9672720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9673270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9673736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9674310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9674754Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9675315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9675778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9676228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9676701Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9677353Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9678035Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:19:30.9678549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9678997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9679345Z dist init r=0, world=2 2022-11-23T03:19:30.9679597Z dist init r=1, world=2 2022-11-23T03:19:30.9679820Z ok (4.015s) 2022-11-23T03:19:30.9680123Z test_fsdp_device_id_use_index_True (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9680586Z Tests the FSDP ``device_id`` argument: ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45228 2022-11-23T03:19:30.9681077Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45229 2022-11-23T03:19:30.9681664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9682107Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9682671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9683118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9683763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9684331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9684830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9685273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9685717Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9686207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9686863Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9687532Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9688061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9688537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9688873Z dist init r=1, world=2 2022-11-23T03:19:30.9689133Z dist init r=0, world=2 2022-11-23T03:19:30.9689378Z ok (4.014s) 2022-11-23T03:19:30.9689873Z test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy_None (__main__.TestFSDPMisc) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45371 2022-11-23T03:19:30.9690436Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45372 2022-11-23T03:19:30.9691051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9691503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9692062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9692544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9693133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9693644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9694137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9694607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9695060Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9695533Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9696195Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9696896Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9697424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9697878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9698238Z dist init r=0, world=2 2022-11-23T03:19:30.9698493Z dist init r=1, world=2 2022-11-23T03:19:30.9698716Z ok (4.516s) 2022-11-23T03:19:30.9699243Z test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45522 2022-11-23T03:19:30.9699857Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45523 2022-11-23T03:19:30.9700468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9701007Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9701593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9702072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9702650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9703075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9703643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9704368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9704895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9705384Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9706061Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9706748Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:19:30.9707274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9707660Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9708021Z dist init r=1, world=2 2022-11-23T03:19:30.9708277Z dist init r=0, world=2 2022-11-23T03:19:30.9708502Z ok (4.616s) 2022-11-23T03:19:30.9708990Z test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy_None (__main__.TestFSDPMisc) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45673 2022-11-23T03:19:30.9709575Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45674 2022-11-23T03:19:30.9710171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9710623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9711199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9711736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9712292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9712743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9713316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9713766Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9714227Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9714719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9715374Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9716043Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9716570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9717047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9717410Z dist init r=0, world=2 2022-11-23T03:19:30.9717647Z dist init r=1, world=2 2022-11-23T03:19:30.9717994Z ok (4.615s) 2022-11-23T03:19:30.9718629Z test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestFSDPMisc) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45824 2022-11-23T03:19:30.9719226Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45825 2022-11-23T03:19:30.9719841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9720303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9720880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9721338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9721924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9722377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9722934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9723407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:19:30.9723868Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9724455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9725070Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9725689Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9726272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9726705Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9727047Z dist init r=1, world=2 2022-11-23T03:19:30.9727308Z dist init r=0, world=2 2022-11-23T03:19:30.9727555Z ok (4.616s) 2022-11-23T03:19:30.9727953Z test_fsdp_namedtuple (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45975 2022-11-23T03:19:30.9728469Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45976 2022-11-23T03:19:30.9729084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9729517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9730096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9730566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9731146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9731579Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9732245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9732629Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9733065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9733568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9734226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9734913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:19:30.9735502Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9736086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9736463Z dist init r=0, world=2 2022-11-23T03:19:30.9736648Z dist init r=1, world=2 2022-11-23T03:19:30.9736871Z ok (4.014s) 2022-11-23T03:19:30.9737367Z test_fsdp_not_all_outputs_used_in_loss (__main__.TestFSDPMisc) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46118 2022-11-23T03:19:30.9737940Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46119 2022-11-23T03:19:30.9738435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9738893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9739473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9739947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9740595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9740958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9741626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9742007Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9742434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9743016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9743584Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based 
barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9744475Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9745210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9745671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9746600Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:19:30.9747152Z warnings.warn(message, UserWarning) 2022-11-23T03:19:30.9747969Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:19:30.9748481Z warnings.warn(message, UserWarning) 2022-11-23T03:19:30.9748771Z dist init r=1, world=2 2022-11-23T03:19:30.9749128Z dist init r=0, world=2 2022-11-23T03:19:30.9749350Z ok (4.719s) 2022-11-23T03:19:30.9749625Z test_fsdp_same_model_across_ranks (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9750079Z FSDP broadcasts model from rank 0 to ensure it starts off with the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46269 2022-11-23T03:19:30.9750666Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46270 2022-11-23T03:19:30.9751212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9751751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9752243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9752824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9753468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9753926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9754478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9754955Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9755412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9755910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9756544Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9757235Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:19:30.9757771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9758247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9758585Z dist init r=1, world=2 2022-11-23T03:19:30.9758851Z dist init r=0, world=2 2022-11-23T03:19:30.9759098Z ok (4.115s) 2022-11-23T03:19:30.9759393Z test_module_device_mismatches_device_id (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9759903Z Tests that specifying a ``device_id`` argument to FSDP for a GPU ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46412 2022-11-23T03:19:30.9760438Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46413 2022-11-23T03:19:30.9761026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9761487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9762120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9762595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9763150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9763604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9764207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9764626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9765082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9765582Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T03:19:30.9766250Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9766921Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9767444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9767926Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9768289Z dist init r=0, world=2 2022-11-23T03:19:30.9768524Z dist init r=1, world=2 2022-11-23T03:19:30.9768767Z ok (4.015s) 2022-11-23T03:19:30.9769072Z test_multi_device_not_supported (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9769693Z Tests that wrapping a multi-device module (i.e. with submodules on ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46555 2022-11-23T03:19:30.9770308Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46556 2022-11-23T03:19:30.9771058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9771429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9771991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9772462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9773043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9773467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9774052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9774523Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9774990Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9775470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9776130Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9776821Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9777353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9777807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9778157Z dist init r=1, world=2 2022-11-23T03:19:30.9778415Z dist init r=0, world=2 2022-11-23T03:19:30.9778641Z ok (4.115s) 2022-11-23T03:19:30.9778923Z test_no_params (__main__.TestFSDPMisc) 2022-11-23T03:19:30.9779434Z Test that device_id and cpu init work if module has no params ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46698 2022-11-23T03:19:30.9779914Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46699 2022-11-23T03:19:30.9780525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9780976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9781557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9782011Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9782690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:19:30.9783040Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:19:30.9783607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:19:30.9784456Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:19:30.9784907Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:19:30.9785410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:19:30.9786041Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:19:30.9786710Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:19:30.9787249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:19:30.9787651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:19:30.9788091Z dist init r=1, world=2 2022-11-23T03:19:30.9788406Z dist init r=0, world=2 2022-11-23T03:19:30.9788659Z ok (4.115s) 2022-11-23T03:19:30.9788788Z 2022-11-23T03:19:30.9789065Z ---------------------------------------------------------------------- 2022-11-23T03:19:30.9789400Z Ran 16 tests in 70.089s 2022-11-23T03:19:30.9789567Z 2022-11-23T03:19:30.9789663Z OK 2022-11-23T03:19:30.9789797Z 2022-11-23T03:19:30.9789902Z Generating XML reports... 2022-11-23T03:19:30.9790478Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_misc/TEST-TestFSDPMisc-20221123031820.xml 2022-11-23T03:19:30.9790812Z 2022-11-23T03:19:30.9791273Z ##[endgroup] 2022-11-23T03:19:30.9791838Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_misc (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_misc_n21uonwg) 2022-11-23T03:19:30.9792189Z 2022-11-23T03:19:31.3166030Z 2022-11-23T03:19:31.3166901Z real 1m17.993s 2022-11-23T03:19:31.3167242Z user 2m33.782s 2022-11-23T03:19:31.3167490Z sys 2m7.297s 2022-11-23T03:19:31.3167708Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:19:31.3168359Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_mixed_precision.py 2022-11-23T03:19:33.6840613Z Ignoring disabled issues: [] 2022-11-23T03:19:33.7379213Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T03:19:33.7379780Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:19:33.7380137Z Selected tests: 2022-11-23T03:19:33.7380447Z distributed/fsdp/test_fsdp_mixed_precision.py 2022-11-23T03:19:33.7406177Z Prioritized test from test file changes. 2022-11-23T03:19:33.7406651Z reordering tests for PR: 2022-11-23T03:19:33.7406937Z prioritized: [] 2022-11-23T03:19:33.7407468Z the rest: ['distributed/fsdp/test_fsdp_mixed_precision.py'] 2022-11-23T03:19:33.7407702Z 2022-11-23T03:19:33.7408247Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:19:33.7409190Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:19:33.7414467Z parallel (file granularity) tests: 2022-11-23T03:19:33.7415515Z 2022-11-23T03:19:33.7415907Z serial (file granularity) tests: 2022-11-23T03:19:33.7416218Z distributed/fsdp/test_fsdp_mixed_precision.py 2022-11-23T03:19:36.0423085Z Ignoring disabled issues: [] 2022-11-23T03:19:36.4618067Z Running distributed/fsdp/test_fsdp_mixed_precision.py ... [2022-11-23 03:19:36.461110] 2022-11-23T03:19:36.4618889Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_mixed_precision.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:19:36.461569] 2022-11-23T03:25:08.7351962Z 2022-11-23T03:25:08.7352769Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_mixed_precision 2022-11-23T03:25:08.7356554Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_mixed_precision (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_mixed_precision_7133gchv) 2022-11-23T03:25:08.7356965Z 2022-11-23T03:25:08.7357079Z Running tests... 
2022-11-23T03:25:08.7357623Z ---------------------------------------------------------------------- 2022-11-23T03:25:08.7358186Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision 2022-11-23T03:25:08.7359234Z test_grads_reduced_precision (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47053 2022-11-23T03:25:08.7359802Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47054 2022-11-23T03:25:08.7360444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7361325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7362161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7362633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7363224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7363699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7364268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7364732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7365169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7365681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7366340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7367033Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7367536Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7368020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7368380Z dist init r=0, world=2 2022-11-23T03:25:08.7368630Z dist init r=1, world=2 2022-11-23T03:25:08.7368850Z ok (6.236s) 2022-11-23T03:25:08.7369411Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47205 2022-11-23T03:25:08.7370164Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47206 2022-11-23T03:25:08.7370779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7371212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7371812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7372278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7372883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7373313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7373893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7374370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7374830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-11-23T03:25:08.7375315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7375971Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7376664Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7377163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7377638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7378598Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7379241Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7380066Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7380623Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7381548Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7382791Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7384437Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7386114Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7387358Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7390063Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7391389Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7392610Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7393831Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7395090Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7395716Z dist init r=0, world=2 2022-11-23T03:25:08.7396066Z dist init r=1, world=2 2022-11-23T03:25:08.7396303Z ok (6.622s) 2022-11-23T03:25:08.7397417Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47357 2022-11-23T03:25:08.7398046Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47358 2022-11-23T03:25:08.7398652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7399110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7399688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7400142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7400721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7401178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7401754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7402203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7402656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7403158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7403811Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7404482Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7405007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7405480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7406361Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7406925Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7407799Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7408385Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7409306Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7410546Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7411753Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7412983Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7414265Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7415479Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7416686Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7438143Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7438884Z dist init r=1, world=2 2022-11-23T03:25:08.7439154Z dist init r=0, world=2 2022-11-23T03:25:08.7439403Z ok (6.521s) 2022-11-23T03:25:08.7440013Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47509 2022-11-23T03:25:08.7440712Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47510 2022-11-23T03:25:08.7441412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7441880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7442492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7443004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7443766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7444279Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7444892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7445398Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7445865Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7446398Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7447095Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7447829Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7448368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7448878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7449809Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7450530Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7451362Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7452088Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7453026Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7454256Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7455489Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7456682Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7457915Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7459144Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7460413Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7461624Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7462846Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7464329Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7464945Z dist init r=0, world=2 2022-11-23T03:25:08.7465179Z dist init r=1, world=2 2022-11-23T03:25:08.7465422Z ok (6.621s) 2022-11-23T03:25:08.7465957Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47661 2022-11-23T03:25:08.7466653Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47662 2022-11-23T03:25:08.7467287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7467756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7468342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7468802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7469381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7469832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7470384Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7470860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7471315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7471819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7472456Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7473147Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7473672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7474140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7475001Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7475604Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7476413Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7477061Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7477969Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7479211Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7480433Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7481663Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7482947Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7484182Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7485413Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7486641Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7487226Z dist init r=1, world=2 2022-11-23T03:25:08.7487482Z dist init r=0, world=2 2022-11-23T03:25:08.7487723Z ok (6.622s) 2022-11-23T03:25:08.7488273Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47813 2022-11-23T03:25:08.7488935Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47814 2022-11-23T03:25:08.7489551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7490005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7490567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7491048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7491636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7492080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7492685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7493154Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7493607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7494087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7494753Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7495442Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7495970Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7496422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7497299Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7497895Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7498705Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7499314Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7500236Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7501468Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7502701Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7505005Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7506510Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.7507285Z return iter(self.unbind(0)) 2022-11-23T03:25:08.7508399Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.7509163Z return iter(self.unbind(0)) 2022-11-23T03:25:08.7509526Z dist init r=1, world=2 2022-11-23T03:25:08.7509787Z dist init r=0, world=2 2022-11-23T03:25:08.7510008Z ok (6.621s) 2022-11-23T03:25:08.7510542Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47965 2022-11-23T03:25:08.7511159Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47966 2022-11-23T03:25:08.7511779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7512212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7512788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7513259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7513825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7514278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7514848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7515310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7515819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7516321Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7516978Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for 
key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7517647Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7518177Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7518654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7519532Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7520100Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7520908Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7521476Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7522398Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7523623Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7524866Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7526116Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7527359Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7528574Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7529792Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7530994Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7532264Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7533476Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7534703Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7535295Z dist init r=0, world=2 2022-11-23T03:25:08.7535559Z dist init r=1, world=2 2022-11-23T03:25:08.7535808Z ok (6.521s) 2022-11-23T03:25:08.7536354Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48117 2022-11-23T03:25:08.7537013Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48118 2022-11-23T03:25:08.7537628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7538078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7538632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7539107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7539684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7540128Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7540679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7541158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7541667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7542154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7542820Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7543509Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7544303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7544771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7545661Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7546250Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7547050Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7547684Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7548604Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7549833Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7551063Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7552379Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7553845Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.7554619Z return iter(self.unbind(0)) 2022-11-23T03:25:08.7555731Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.7556495Z return iter(self.unbind(0)) 2022-11-23T03:25:08.7556765Z dist init r=1, world=2 2022-11-23T03:25:08.7557018Z dist init r=0, world=2 2022-11-23T03:25:08.7557239Z ok (6.623s) 2022-11-23T03:25:08.7557844Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48269 2022-11-23T03:25:08.7558472Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48270 2022-11-23T03:25:08.7559087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7559522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7560103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7560574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7561134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7561579Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7562154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7562622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7563065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7563577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7564298Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7564986Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7565488Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7565966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7566851Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7567428Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7568215Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7568792Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7569714Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7570930Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7572152Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7573409Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7574650Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7575865Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7577091Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7578295Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7579502Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7580791Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7582034Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7582619Z dist init r=0, world=2 2022-11-23T03:25:08.7582880Z dist init r=1, world=2 2022-11-23T03:25:08.7583121Z ok (6.621s) 2022-11-23T03:25:08.7583668Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48421 2022-11-23T03:25:08.7584542Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48422 2022-11-23T03:25:08.7585161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7585619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7586178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7586654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7587243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7587708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7588263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7588751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7589218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7589810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7590469Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7591348Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7591883Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7592356Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7593246Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7593848Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7594660Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7595247Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7596144Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7597464Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7598695Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7599909Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7601138Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7602360Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7603586Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7604790Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7606062Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7607290Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7607896Z dist init r=0, world=2 2022-11-23T03:25:08.7608153Z dist init r=1, world=2 2022-11-23T03:25:08.7608378Z ok (6.622s) 2022-11-23T03:25:08.7608896Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48573 2022-11-23T03:25:08.7609502Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48574 2022-11-23T03:25:08.7610108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7610531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7611091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7611617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7612177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7612611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7613172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7613636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7614070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T03:25:08.7614569Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7615220Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7615908Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7616411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7616882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7617755Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7618336Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7619114Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7619691Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7620603Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7621894Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7623123Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7624619Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7625831Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7627055Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7628355Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7629570Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7630166Z dist init r=1, world=2 2022-11-23T03:25:08.7630471Z dist init r=0, world=2 2022-11-23T03:25:08.7630694Z ok (6.620s) 2022-11-23T03:25:08.7631236Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48725 2022-11-23T03:25:08.7631938Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48726 2022-11-23T03:25:08.7632537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7632987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7633674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7634151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7634711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7635162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7635730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7636182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7636635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7637129Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7637784Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7638532Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7639060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7639535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7640414Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7640983Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7641788Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7642370Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7643283Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7644514Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7645788Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7647007Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7648234Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7649449Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7650665Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7651875Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7653231Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7654451Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7655030Z dist init r=0, world=2 2022-11-23T03:25:08.7655290Z dist init r=1, world=2 2022-11-23T03:25:08.7655602Z ok (6.521s) 2022-11-23T03:25:08.7656095Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48877 2022-11-23T03:25:08.7656696Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48878 2022-11-23T03:25:08.7657309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7657761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7658318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7658790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7659365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7659867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7660419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7660883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7661338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7661825Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7662489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7663174Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7663696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7664422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7665311Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7665896Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7666701Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7667279Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7668177Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7669416Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7670716Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7671956Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7673179Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7674403Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7675612Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7676881Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7677481Z dist init r=1, world=2 2022-11-23T03:25:08.7677736Z dist init r=0, world=2 2022-11-23T03:25:08.7677985Z ok (6.620s) 2022-11-23T03:25:08.7678508Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49029 2022-11-23T03:25:08.7679141Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49030 2022-11-23T03:25:08.7679747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7680199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7680756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7681307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7681890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7682316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7682890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7683351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7683807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7684292Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7684944Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7685632Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7686239Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7686697Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7687578Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7688163Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7688972Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7689528Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7690445Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7691671Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7692961Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7694180Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7695643Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.7696420Z return iter(self.unbind(0)) 2022-11-23T03:25:08.7697536Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.7698302Z return iter(self.unbind(0)) 2022-11-23T03:25:08.7698579Z dist init r=1, world=2 2022-11-23T03:25:08.7698815Z dist init r=0, world=2 2022-11-23T03:25:08.7699055Z ok (6.621s) 2022-11-23T03:25:08.7699567Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49181 2022-11-23T03:25:08.7700166Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49182 2022-11-23T03:25:08.7700760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7701206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7701782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7702301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7702867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7703309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7704342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7704814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7705270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7705773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7706439Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7707113Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7707634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7708107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7708986Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7709657Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7710464Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7711044Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7711963Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7713190Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7714404Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7715631Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7716841Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7718118Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7719345Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7720563Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7721785Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7722996Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7724189Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7724847Z dist init r=1, world=2 2022-11-23T03:25:08.7725102Z dist init r=0, world=2 2022-11-23T03:25:08.7725346Z ok (6.520s) 2022-11-23T03:25:08.7725879Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49333 2022-11-23T03:25:08.7726507Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49334 2022-11-23T03:25:08.7727156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7727607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7728168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7728639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7729221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7729649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7730220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7730682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7731138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7731618Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7732279Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7732971Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7733490Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7733943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7734864Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7735447Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7736260Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7736824Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7737742Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7738972Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7740203Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7741477Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7742941Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.7743804Z return iter(self.unbind(0)) 2022-11-23T03:25:08.7745268Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.7746040Z return iter(self.unbind(0)) 2022-11-23T03:25:08.7746311Z dist init r=1, world=2 2022-11-23T03:25:08.7746568Z dist init r=0, world=2 2022-11-23T03:25:08.7746793Z ok (6.620s) 2022-11-23T03:25:08.7747355Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49485 2022-11-23T03:25:08.7747955Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49486 2022-11-23T03:25:08.7748557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7749009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7749582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7750056Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7750691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7751152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7751727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7752271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7752731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7753237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7753896Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7754567Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7755091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7755560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7756444Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7757082Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7757898Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7758473Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7759395Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7760627Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7761843Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7763065Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7764288Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7765502Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7766771Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7768001Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7769219Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7770435Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7771624Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7772276Z dist init r=0, world=2 2022-11-23T03:25:08.7772530Z dist init r=1, world=2 2022-11-23T03:25:08.7772769Z ok (6.620s) 2022-11-23T03:25:08.7773298Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49637 2022-11-23T03:25:08.7773929Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49638 2022-11-23T03:25:08.7774549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7774999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7775559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7776029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7776610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7777033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7777604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7778070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7778529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7779009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7779665Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7780351Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7780881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7781334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7782387Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7783629Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7785210Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7786430Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7787003Z dist init r=0, world=2 2022-11-23T03:25:08.7787248Z dist init r=1, world=2 2022-11-23T03:25:08.7787483Z ok (6.520s) 2022-11-23T03:25:08.7787975Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49789 2022-11-23T03:25:08.7788665Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49790 2022-11-23T03:25:08.7789275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7789719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7790272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7790741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7791318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7791758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7792307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7792766Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7793218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7793692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7794347Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7795033Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7795547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7795997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7796982Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7798215Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7799499Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7800737Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7801953Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7803172Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7804380Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7805633Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7806231Z dist init r=1, world=2 2022-11-23T03:25:08.7806479Z dist init r=0, world=2 2022-11-23T03:25:08.7806719Z ok (6.519s) 2022-11-23T03:25:08.7807242Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49941 2022-11-23T03:25:08.7807868Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49942 2022-11-23T03:25:08.7808480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7808904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7809475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7809940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7810513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7810936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7811502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7811961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7812408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7812894Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7813543Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7814225Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7814773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7815248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7816229Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7817462Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7818685Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7819892Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7820550Z dist init r=1, world=2 2022-11-23T03:25:08.7820798Z dist init r=0, world=2 2022-11-23T03:25:08.7821034Z ok (6.621s) 2022-11-23T03:25:08.7821528Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50093 2022-11-23T03:25:08.7822125Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50094 2022-11-23T03:25:08.7822735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7823180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7823732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7824460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7825043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7825483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7826028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7826485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7826941Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7827495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7828146Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7828830Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7829348Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7829798Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7830785Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7832094Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7833326Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7834551Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7835767Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7836986Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7838270Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7839490Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7840085Z dist init r=1, world=2 2022-11-23T03:25:08.7840336Z dist init r=0, world=2 2022-11-23T03:25:08.7840572Z ok (6.419s) 2022-11-23T03:25:08.7841091Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50245 2022-11-23T03:25:08.7841716Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50246 2022-11-23T03:25:08.7842320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7842750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7843319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7843780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7844347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7844774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7845336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7845791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7846237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7846766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7847426Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7848106Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7848604Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7849074Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7850049Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7851273Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7852584Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7853870Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7854447Z dist init r=1, world=2 2022-11-23T03:25:08.7854694Z dist init r=0, world=2 2022-11-23T03:25:08.7854933Z ok (6.620s) 2022-11-23T03:25:08.7855424Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50397 2022-11-23T03:25:08.7856016Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50398 2022-11-23T03:25:08.7856627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7857081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7857632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7858093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7858668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7859111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7859657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7860114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7860562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7861047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7861693Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7862374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7862890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7863384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7864630Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7865872Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7867091Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7868310Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7869606Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7870825Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7872005Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7873225Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7873820Z dist init r=0, world=2 2022-11-23T03:25:08.7874071Z dist init r=1, world=2 2022-11-23T03:25:08.7874307Z ok (6.521s) 2022-11-23T03:25:08.7874832Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50549 2022-11-23T03:25:08.7875453Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50550 2022-11-23T03:25:08.7876057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7876489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7877056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7877518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7878088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7878596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7879225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7879698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7880151Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7880628Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7881286Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7881967Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7882462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7882929Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7883911Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7885132Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7886412Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7887622Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7888839Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7890033Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7890637Z dist init r=0, world=2 2022-11-23T03:25:08.7890959Z dist init r=1, world=2 2022-11-23T03:25:08.7891195Z ok (6.620s) 2022-11-23T03:25:08.7891687Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50701 2022-11-23T03:25:08.7892284Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50702 2022-11-23T03:25:08.7892896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7893326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7893893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7894353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7894986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7895411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7895979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7896433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7896889Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7897364Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7898012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7898692Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7899193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7899661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7900640Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7901929Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7903148Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7904631Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7905835Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7907054Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7908256Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7909477Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7910152Z dist init r=0, world=2 2022-11-23T03:25:08.7910403Z dist init r=1, world=2 2022-11-23T03:25:08.7910623Z ok (6.520s) 2022-11-23T03:25:08.7911253Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50853 2022-11-23T03:25:08.7911905Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50854 2022-11-23T03:25:08.7912516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7912946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7913514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7913972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7914527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7914968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7915526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7915976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7916404Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7916975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7917622Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7918304Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7918801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7919275Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7920150Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7920722Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7921511Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7922083Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7922998Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7924217Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7925453Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7926696Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7927922Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7929135Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7930353Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7931653Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7932949Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7934159Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7934751Z dist init r=0, world=2 2022-11-23T03:25:08.7934980Z dist init r=1, world=2 2022-11-23T03:25:08.7935214Z ok (6.520s) 2022-11-23T03:25:08.7935740Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51005 2022-11-23T03:25:08.7936338Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51006 2022-11-23T03:25:08.7936944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7937390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7937959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7938414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7938985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7939424Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7939968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7940434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7940881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:25:08.7941373Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7942008Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7942802Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7943327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7943790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7944923Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7945517Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7946318Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7946882Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7947879Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7949103Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7950424Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7951648Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7952941Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7954170Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7955378Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7956589Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7957235Z dist init r=0, world=2 2022-11-23T03:25:08.7957484Z dist init r=1, world=2 2022-11-23T03:25:08.7957722Z ok (6.521s) 2022-11-23T03:25:08.7958329Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51157 2022-11-23T03:25:08.7958988Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51158 2022-11-23T03:25:08.7959600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7960049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7960604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7961065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7961641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7962078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7962620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7963081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7963531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7964008Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7964660Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7965403Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7965928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7966379Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7967262Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7967845Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7968659Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7969227Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7970156Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7971393Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7972621Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7973833Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7975094Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7976306Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7977520Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7978739Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7979957Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7981219Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7981811Z dist init r=1, world=2 2022-11-23T03:25:08.7982065Z dist init r=0, world=2 2022-11-23T03:25:08.7982292Z ok (6.520s) 2022-11-23T03:25:08.7982826Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51309 2022-11-23T03:25:08.7983440Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51310 2022-11-23T03:25:08.7984293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7984740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7985316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7985781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7986336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.7986782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.7987343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.7987802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.7988232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.7988733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.7989382Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.7990047Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.7990566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.7991111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.7992003Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7992584Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7993356Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.7993928Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.7994846Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7996079Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7997361Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.7998687Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.7999906Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8001133Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8002341Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8003548Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8004143Z dist init r=1, world=2 2022-11-23T03:25:08.8004378Z dist init r=0, world=2 2022-11-23T03:25:08.8004611Z ok (6.621s) 2022-11-23T03:25:08.8005169Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51461 2022-11-23T03:25:08.8005807Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51462 2022-11-23T03:25:08.8006454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8006910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8007484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8007950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8008519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8008957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8009522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8009968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8010428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8010919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8011569Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8012236Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8012814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8013280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8014158Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8014718Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8015520Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8016093Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8017039Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8018242Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8019477Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8020689Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8022205Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.8022978Z return iter(self.unbind(0)) 2022-11-23T03:25:08.8024337Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.8025120Z return iter(self.unbind(0)) 2022-11-23T03:25:08.8025365Z dist init r=1, world=2 2022-11-23T03:25:08.8025613Z dist init r=0, world=2 2022-11-23T03:25:08.8025847Z ok (6.621s) 2022-11-23T03:25:08.8026357Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51613 2022-11-23T03:25:08.8026969Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51614 2022-11-23T03:25:08.8027574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8028077Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8028720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8029186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8029756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8030197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8030745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8031203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8031648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8032124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8032780Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8033458Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8033973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8034423Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8035293Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8035874Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8036672Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8037233Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8038154Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8039439Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8040688Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8041915Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8043150Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8044368Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8045651Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8046875Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8048095Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8049312Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8050524Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8051122Z dist init r=0, world=2 2022-11-23T03:25:08.8051371Z dist init r=1, world=2 2022-11-23T03:25:08.8051590Z ok (6.621s) 2022-11-23T03:25:08.8052222Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51765 2022-11-23T03:25:08.8052885Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51766 2022-11-23T03:25:08.8053481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8053936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8054567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8055053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8055623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8056077Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8056647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8057093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8057549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8058054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8058719Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8059391Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8059911Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8060453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8061339Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8061907Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8062719Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8063297Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8064564Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8065809Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8067018Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8068234Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8069701Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.8070551Z return iter(self.unbind(0)) 2022-11-23T03:25:08.8071703Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:911: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T03:25:08.8072468Z return iter(self.unbind(0)) 2022-11-23T03:25:08.8072716Z dist init r=1, world=2 2022-11-23T03:25:08.8072976Z dist init r=0, world=2 2022-11-23T03:25:08.8073219Z ok (6.722s) 2022-11-23T03:25:08.8073729Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51917 2022-11-23T03:25:08.8074345Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51918 2022-11-23T03:25:08.8074958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8075415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8075974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8076523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8077107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8077559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8078115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8078587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8079046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8079527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8080189Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8080880Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8081409Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8081862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8082743Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8083331Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8084137Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8084687Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8085614Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8086901Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8088153Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8089385Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8090606Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8091809Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8093100Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8094296Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8095519Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8096729Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8097945Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8098539Z dist init r=0, world=2 2022-11-23T03:25:08.8098786Z dist init r=1, world=2 2022-11-23T03:25:08.8099090Z ok (6.623s) 2022-11-23T03:25:08.8099641Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52069 2022-11-23T03:25:08.8100273Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52070 2022-11-23T03:25:08.8100864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8101312Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8101885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8102404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8102976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8103419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8104233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8104711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8105144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8105637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8106296Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8106963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8107478Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8107946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8108935Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8110256Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8111469Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8112683Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8113897Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8115097Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8116309Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8117535Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8118187Z dist init r=1, world=2 2022-11-23T03:25:08.8118306Z dist init r=0, world=2 2022-11-23T03:25:08.8118390Z ok (6.523s) 2022-11-23T03:25:08.8118785Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52221 2022-11-23T03:25:08.8119002Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52222 2022-11-23T03:25:08.8119384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8119559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8120231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8120423Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8120789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8120944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8121315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8121501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8121806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8122049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8122446Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8122837Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8123067Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8123291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8124030Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8124774Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8125502Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8126233Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8126940Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8127725Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8128459Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8129191Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8129305Z dist init r=1, world=2 2022-11-23T03:25:08.8129412Z dist init r=0, world=2 2022-11-23T03:25:08.8129510Z ok (6.521s) 2022-11-23T03:25:08.8129938Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52373 2022-11-23T03:25:08.8130156Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52374 2022-11-23T03:25:08.8130525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8130749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8131127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8131299Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8131658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8131835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8132201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8132386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8132629Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8132873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8133267Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8133657Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8133868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8134096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8134838Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8135567Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8136351Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8137084Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8137816Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8138541Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8139272Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8140059Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8140171Z dist init r=1, world=2 2022-11-23T03:25:08.8140278Z dist init r=0, world=2 2022-11-23T03:25:08.8140375Z ok (6.521s) 2022-11-23T03:25:08.8140772Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52525 2022-11-23T03:25:08.8140992Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52526 2022-11-23T03:25:08.8141343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8141518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8141899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8142090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8142449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8142620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8142991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8143177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8143423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8143650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8144291Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8144705Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8144934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8145160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8145969Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8146719Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8147443Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8148178Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8148898Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8149697Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8150421Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8151150Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8151268Z dist init r=1, world=2 2022-11-23T03:25:08.8151376Z dist init r=0, world=2 2022-11-23T03:25:08.8151457Z ok (6.522s) 2022-11-23T03:25:08.8151881Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52677 2022-11-23T03:25:08.8152102Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52678 2022-11-23T03:25:08.8152567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8152744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8153121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8153315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8153681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8153853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8154201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8154440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8154693Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8154934Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8155333Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8155731Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8155962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8156188Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8156928Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8157655Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8158488Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8159216Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8159949Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8160672Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8160767Z dist init r=1, world=2 2022-11-23T03:25:08.8160876Z dist init r=0, world=2 2022-11-23T03:25:08.8160976Z ok (6.620s) 2022-11-23T03:25:08.8161369Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52829 2022-11-23T03:25:08.8161587Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52830 2022-11-23T03:25:08.8161960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8162139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8162515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8162688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8163049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8163291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8163676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8163864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8164106Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8164352Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8164749Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8165140Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8165354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8165582Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8166323Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8167113Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8167834Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8168565Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8169284Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8170026Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8170750Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8171522Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8171637Z dist init r=0, world=2 2022-11-23T03:25:08.8171744Z dist init r=1, world=2 2022-11-23T03:25:08.8171842Z ok (6.521s) 2022-11-23T03:25:08.8172305Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52981 2022-11-23T03:25:08.8172530Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52982 2022-11-23T03:25:08.8172902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8173058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8173438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8173628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8173987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8174160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8174532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8174719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8174960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8175183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8175639Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8176030Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8176258Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8176483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8177225Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8177958Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8178693Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8179416Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8180140Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8180859Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8180970Z dist init r=0, world=2 2022-11-23T03:25:08.8181125Z dist init r=1, world=2 2022-11-23T03:25:08.8181229Z ok (6.721s) 2022-11-23T03:25:08.8181606Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53133 2022-11-23T03:25:08.8181826Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53134 2022-11-23T03:25:08.8182199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8182372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8182748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8182937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8183301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8183472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8183843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8184283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8184534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8184854Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8185259Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8185653Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8185883Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8186110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8186852Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8187589Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8188310Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8189043Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:25:08.8189764Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8190565Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8191366Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8192100Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:25:08.8192212Z dist init r=1, world=2 2022-11-23T03:25:08.8192301Z dist init r=0, world=2 2022-11-23T03:25:08.8192400Z ok (6.521s) 2022-11-23T03:25:08.8192761Z test_mixed_precision_no_reshard_after_forward (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53285 2022-11-23T03:25:08.8192978Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53286 2022-11-23T03:25:08.8193348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8193574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8193952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8194142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8194490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8194660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8195038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8195226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8195469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8195710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8196107Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8196502Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8196731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8196939Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8197575Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8197722Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8198349Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8198491Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8198602Z dist init r=1, world=2 2022-11-23T03:25:08.8198710Z dist init r=0, world=2 2022-11-23T03:25:08.8198809Z ok (6.122s) 2022-11-23T03:25:08.8199005Z test_mixed_precision_resnet (__main__.TestFSDPMixedPrecisionSharded) 2022-11-23T03:25:08.8199282Z End to end test to ensure mixed precision + auto_wrap works ... skip: no torchvision (0.001s) 2022-11-23T03:25:08.8199638Z test_mp_batchnorm_convert_sync_bn_False (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53437 2022-11-23T03:25:08.8199859Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53438 2022-11-23T03:25:08.8200228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8200404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8200778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8200966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8201322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8201479Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8201851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8202037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8202278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8202586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8202983Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8203375Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8203603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8203832Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8203926Z dist init r=0, world=2 2022-11-23T03:25:08.8204032Z dist init r=1, world=2 2022-11-23T03:25:08.8204130Z ok (6.522s) 2022-11-23T03:25:08.8204480Z test_mp_batchnorm_convert_sync_bn_True (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53589 2022-11-23T03:25:08.8204700Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53590 2022-11-23T03:25:08.8205065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8205238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8205608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8205779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8206140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8206312Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8206687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8206873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8207121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8207362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8207757Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8208149Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8208424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8208660Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8209279Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:25:08.8209394Z warnings.warn( 2022-11-23T03:25:08.8210012Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:25:08.8210122Z warnings.warn( 2022-11-23T03:25:08.8210231Z dist init r=0, world=2 2022-11-23T03:25:08.8210338Z dist init r=1, world=2 2022-11-23T03:25:08.8210419Z ok (6.121s) 2022-11-23T03:25:08.8210756Z test_mp_embedding_default (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53741 2022-11-23T03:25:08.8210976Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53742 2022-11-23T03:25:08.8211341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8211566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8211941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8212129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8212488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8212660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8213015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8213205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8213445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8213685Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8214083Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8214474Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:25:08.8214702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8214931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8215025Z dist init r=1, world=2 2022-11-23T03:25:08.8215134Z dist init r=0, world=2 2022-11-23T03:25:08.8215232Z ok (6.320s) 2022-11-23T03:25:08.8215583Z test_mp_embedding_only_params_and_bufs (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53893 2022-11-23T03:25:08.8215799Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53894 2022-11-23T03:25:08.8216174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8216348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8216723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8216893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8217304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8217490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8217861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8218046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8218289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8218528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8218923Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8219315Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8219531Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8219763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8219874Z dist init r=0, world=2 2022-11-23T03:25:08.8219981Z dist init r=1, world=2 2022-11-23T03:25:08.8220078Z ok (6.320s) 2022-11-23T03:25:08.8220431Z test_mp_embedding_params_and_reduce_diff (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54045 2022-11-23T03:25:08.8220701Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54046 2022-11-23T03:25:08.8221070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8221225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8221602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8221791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8222150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8222323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8222692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8222881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8223122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:25:08.8223361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8223736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8224390Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8224623Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8224842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8224957Z dist init r=1, world=2 2022-11-23T03:25:08.8225067Z dist init r=0, world=2 2022-11-23T03:25:08.8225167Z ok (6.119s) 2022-11-23T03:25:08.8225504Z test_mp_embedding_reduce (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54197 2022-11-23T03:25:08.8225706Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54198 2022-11-23T03:25:08.8226078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8226321Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8226696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8226867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8227237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8227427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8227799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T03:25:08.8227969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8228210Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:08.8228449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8228845Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8229237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:25:08.8229462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8229750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:08.8229861Z dist init r=0, world=2 2022-11-23T03:25:08.8229969Z dist init r=1, world=2 2022-11-23T03:25:08.8230051Z ok (6.521s) 2022-11-23T03:25:08.8230401Z test_grads_reduced_precision (__main__.TestFSDPMixedPrecisionUnsharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54349 2022-11-23T03:25:08.8230775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8230950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8231322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8231590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8231836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8232238Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-11-23T03:25:08.8232447Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8232558Z dist init r=0, world=1 2022-11-23T03:25:08.8232657Z ok (5.015s) 2022-11-23T03:25:08.8233010Z test_mixed_precision_e2e_full_shard (__main__.TestFSDPMixedPrecisionUnsharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54425 2022-11-23T03:25:08.8233378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:08.8233551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:08.8233925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:08.8234117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:08.8234361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:08.8234736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:25:08.8234964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:08.8235649Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:25:08.8235798Z warnings.warn(message, UserWarning) 2022-11-23T03:25:08.8235907Z dist init r=0, world=1 2022-11-23T03:25:08.8236006Z ok (4.815s) 2022-11-23T03:25:08.8236371Z test_mixed_precision_no_reshard_after_forward (__main__.TestFSDPMixedPrecisionUnsharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54501
2022-11-23T03:25:08.8236744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests
2022-11-23T03:25:08.8236919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-11-23T03:25:08.8237276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests
2022-11-23T03:25:08.8237464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-11-23T03:25:08.8237706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-11-23T03:25:08.8238101Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
2022-11-23T03:25:08.8238327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-11-23T03:25:08.8239010Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.
2022-11-23T03:25:08.8239152Z warnings.warn(message, UserWarning)
2022-11-23T03:25:08.8239261Z dist init r=0, world=1
2022-11-23T03:25:08.8239341Z ok (4.816s)
2022-11-23T03:25:08.8239386Z
2022-11-23T03:25:08.8239635Z ----------------------------------------------------------------------
2022-11-23T03:25:08.8239752Z Ran 52 tests in 328.071s
2022-11-23T03:25:08.8239775Z
2022-11-23T03:25:08.8239880Z OK (skipped=1)
2022-11-23T03:25:08.8239900Z
2022-11-23T03:25:08.8240021Z Generating XML reports...
2022-11-23T03:25:08.8240528Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision/TEST-TestFSDPMixedPrecisionSharded-20221123031940.xml
2022-11-23T03:25:08.8241031Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision/TEST-TestFSDPMixedPrecisionUnsharded-20221123031940.xml
2022-11-23T03:25:08.8241057Z
2022-11-23T03:25:08.8241532Z ##[endgroup]
2022-11-23T03:25:08.8242016Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_mixed_precision (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_mixed_precision_7133gchv)
2022-11-23T03:25:08.8242054Z
2022-11-23T03:25:09.1381924Z
2022-11-23T03:25:09.1382461Z real 5m37.822s
2022-11-23T03:25:09.1382760Z user 10m36.426s
2022-11-23T03:25:09.1383008Z sys 6m51.245s
2022-11-23T03:25:09.1383323Z + for f in test/distributed/fsdp/*.py
2022-11-23T03:25:09.1384268Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_multiple_forward.py
2022-11-23T03:25:11.5246487Z Ignoring disabled issues: []
2022-11-23T03:25:11.5776056Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
2022-11-23T03:25:11.5776652Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6":
2022-11-23T03:25:11.5777230Z Selected tests:
2022-11-23T03:25:11.5777606Z distributed/fsdp/test_fsdp_multiple_forward.py
2022-11-23T03:25:11.5802586Z Prioritized test from test file changes.
2022-11-23T03:25:11.5803279Z reordering tests for PR:
2022-11-23T03:25:11.5803850Z prioritized: []
2022-11-23T03:25:11.5804492Z the rest: ['distributed/fsdp/test_fsdp_multiple_forward.py']
2022-11-23T03:25:11.5804725Z
2022-11-23T03:25:11.5805550Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json
2022-11-23T03:25:11.5806525Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json
2022-11-23T03:25:11.5810238Z parallel (file granularity) tests:
2022-11-23T03:25:11.5810826Z
2022-11-23T03:25:11.5811337Z serial (file granularity) tests:
2022-11-23T03:25:11.5811674Z distributed/fsdp/test_fsdp_multiple_forward.py
2022-11-23T03:25:13.9045450Z Ignoring disabled issues: []
2022-11-23T03:25:14.2992374Z Running distributed/fsdp/test_fsdp_multiple_forward.py ... [2022-11-23 03:25:14.298365]
2022-11-23T03:25:14.2993185Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_multiple_forward.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:25:14.298885]
2022-11-23T03:25:23.5509948Z
2022-11-23T03:25:23.5510725Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_multiple_forward
2022-11-23T03:25:23.5511791Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_forward (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_multiple_forward_7z8yc3xr)
2022-11-23T03:25:23.5512220Z
2022-11-23T03:25:23.5512321Z Running tests...
2022-11-23T03:25:23.5512905Z ----------------------------------------------------------------------
2022-11-23T03:25:23.5513856Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward
2022-11-23T03:25:23.5514434Z test_multi_forward (__main__.TestMultiForward) ...
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:25:23.5514898Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54789 2022-11-23T03:25:23.5515353Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54790 2022-11-23T03:25:23.5515812Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54791 2022-11-23T03:25:23.5516247Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54792 2022-11-23T03:25:23.5516909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:23.5517372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:23.5517880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:23.5518421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:23.5519001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:23.5519460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:23.5519946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:23.5520406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:23.5520995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:23.5521453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:23.5522002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:23.5522476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:23.5523060Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:23.5523511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:23.5524137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:23.5524650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:23.5525209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:23.5525723Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:23.5526196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:25:23.5526692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:25:23.5527361Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:25:23.5528055Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:25:23.5528719Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:25:23.5529413Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:25:23.5529950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:23.5530436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:25:23.5530890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:25:23.5531413Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:23.5531898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:25:23.5532365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:25:23.5532847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:25:23.5533335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:25:23.5534622Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:25:23.5535414Z warnings.warn( 2022-11-23T03:25:23.5536546Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:25:23.5537329Z warnings.warn( 2022-11-23T03:25:23.5538476Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:25:23.5539256Z warnings.warn( 2022-11-23T03:25:23.5540470Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:25:23.5541238Z warnings.warn( 2022-11-23T03:25:23.5541474Z dist init r=3, world=4 2022-11-23T03:25:23.5541830Z dist init r=0, world=4 2022-11-23T03:25:23.5541991Z dist init r=1, world=4 2022-11-23T03:25:23.5542222Z dist init r=2, world=4 2022-11-23T03:25:23.5542467Z ok (6.749s) 2022-11-23T03:25:23.5542621Z 2022-11-23T03:25:23.5542898Z ---------------------------------------------------------------------- 2022-11-23T03:25:23.5543221Z Ran 1 test in 6.750s 2022-11-23T03:25:23.5543383Z 2022-11-23T03:25:23.5543484Z OK 2022-11-23T03:25:23.5543623Z 2022-11-23T03:25:23.5543754Z Generating XML reports... 
2022-11-23T03:25:23.5544876Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward/TEST-TestMultiForward-20221123032516.xml 2022-11-23T03:25:23.5545183Z 2022-11-23T03:25:23.5545563Z ##[endgroup] 2022-11-23T03:25:23.5546228Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_forward (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_multiple_forward_7z8yc3xr) 2022-11-23T03:25:23.5546602Z 2022-11-23T03:25:23.8895830Z 2022-11-23T03:25:23.8896371Z real 0m14.751s 2022-11-23T03:25:23.8896739Z user 0m29.881s 2022-11-23T03:25:23.8896976Z sys 0m20.552s 2022-11-23T03:25:23.8897171Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:25:23.8897825Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_multiple_wrapping.py 2022-11-23T03:25:26.2329408Z Ignoring disabled issues: [] 2022-11-23T03:25:26.2860329Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:25:26.2860821Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:25:26.2861178Z Selected tests: 2022-11-23T03:25:26.2861474Z distributed/fsdp/test_fsdp_multiple_wrapping.py 2022-11-23T03:25:26.2890692Z Prioritized test from test file changes. 
2022-11-23T03:25:26.2891215Z reordering tests for PR: 2022-11-23T03:25:26.2891517Z prioritized: [] 2022-11-23T03:25:26.2892037Z the rest: ['distributed/fsdp/test_fsdp_multiple_wrapping.py'] 2022-11-23T03:25:26.2892277Z 2022-11-23T03:25:26.2892724Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:25:26.2893689Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:25:26.2899778Z parallel (file granularity) tests: 2022-11-23T03:25:26.2900092Z 2022-11-23T03:25:26.2900430Z serial (file granularity) tests: 2022-11-23T03:25:26.2900767Z distributed/fsdp/test_fsdp_multiple_wrapping.py 2022-11-23T03:25:28.5673017Z Ignoring disabled issues: [] 2022-11-23T03:25:28.9408590Z Running distributed/fsdp/test_fsdp_multiple_wrapping.py ... [2022-11-23 03:25:28.940298] 2022-11-23T03:25:28.9411131Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_multiple_wrapping.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:25:28.940786] 2022-11-23T03:25:38.0382865Z 2022-11-23T03:25:38.0383931Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_multiple_wrapping 2022-11-23T03:25:38.0385412Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_wrapping (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_multiple_wrapping_92r0xvys) 2022-11-23T03:25:38.0385878Z 2022-11-23T03:25:38.0385994Z Running tests... 
2022-11-23T03:25:38.0386570Z ---------------------------------------------------------------------- 2022-11-23T03:25:38.0387149Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_wrapping 2022-11-23T03:25:38.0387632Z test_multiple_wrapping (__main__.TestMultipleWrapping) 2022-11-23T03:25:38.0388395Z This test simulates wrapping the module after training to run inference. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:25:38.0388938Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55302 2022-11-23T03:25:38.0389373Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55303 2022-11-23T03:25:38.0389811Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55304 2022-11-23T03:25:38.0390294Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55305 2022-11-23T03:25:38.0390911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:38.0391387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:38.0391973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:38.0392398Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:38.0392991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:38.0393428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:38.0394004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:38.0394476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:38.0395173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:25:38.0395627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:38.0396204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:38.0396674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:38.0397238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:25:38.0397692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:25:38.0398267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:25:38.0398716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:25:38.0399185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:25:38.0399692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:25:38.0400186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:25:38.0400658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:25:38.0401325Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:25:38.0402021Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:25:38.0402709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:25:38.0403371Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:25:38.0403908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:25:38.0404386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:25:38.0404856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:25:38.0405302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:25:38.0406631Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:25:38.0407443Z warnings.warn( 2022-11-23T03:25:38.0408593Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:25:38.0409368Z warnings.warn( 2022-11-23T03:25:38.0410506Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:25:38.0411331Z warnings.warn( 2022-11-23T03:25:38.0412531Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:25:38.0413298Z warnings.warn( 2022-11-23T03:25:38.0413555Z dist init r=2, world=4 2022-11-23T03:25:38.0413790Z dist init r=0, world=4 2022-11-23T03:25:38.0414046Z dist init r=3, world=4 2022-11-23T03:25:38.0414298Z dist init r=1, world=4 2022-11-23T03:25:38.0414520Z ok (6.669s) 2022-11-23T03:25:38.0414670Z 2022-11-23T03:25:38.0414949Z ---------------------------------------------------------------------- 2022-11-23T03:25:38.0415292Z Ran 1 test in 6.670s 2022-11-23T03:25:38.0415459Z 2022-11-23T03:25:38.0415534Z OK 2022-11-23T03:25:38.0415672Z 2022-11-23T03:25:38.0415800Z Generating XML reports... 
2022-11-23T03:25:38.0416439Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_wrapping/TEST-TestMultipleWrapping-20221123032530.xml 2022-11-23T03:25:38.0416824Z 2022-11-23T03:25:38.0417130Z ##[endgroup] 2022-11-23T03:25:38.0417781Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_wrapping (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_multiple_wrapping_92r0xvys) 2022-11-23T03:25:38.0418175Z 2022-11-23T03:25:38.4081047Z 2022-11-23T03:25:38.4102350Z real 0m14.518s 2022-11-23T03:25:38.4102745Z user 0m30.854s 2022-11-23T03:25:38.4105738Z sys 0m24.020s 2022-11-23T03:25:38.4106073Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:25:38.4106710Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_optim_state.py 2022-11-23T03:25:40.8639799Z Ignoring disabled issues: [] 2022-11-23T03:25:40.9171137Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:25:40.9171713Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:25:40.9172074Z Selected tests: 2022-11-23T03:25:40.9172360Z distributed/fsdp/test_fsdp_optim_state.py 2022-11-23T03:25:40.9196120Z Prioritized test from test file changes. 
2022-11-23T03:25:40.9196617Z reordering tests for PR: 2022-11-23T03:25:40.9197681Z prioritized: [] 2022-11-23T03:25:40.9198205Z the rest: ['distributed/fsdp/test_fsdp_optim_state.py'] 2022-11-23T03:25:40.9198425Z 2022-11-23T03:25:40.9198959Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:25:40.9199893Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:25:40.9204902Z parallel (file granularity) tests: 2022-11-23T03:25:40.9205646Z 2022-11-23T03:25:40.9206372Z serial (file granularity) tests: 2022-11-23T03:25:40.9206846Z distributed/fsdp/test_fsdp_optim_state.py 2022-11-23T03:25:43.1828247Z Ignoring disabled issues: [] 2022-11-23T03:25:43.5470086Z Running distributed/fsdp/test_fsdp_optim_state.py ... [2022-11-23 03:25:43.546467] 2022-11-23T03:25:43.5473043Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_optim_state.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:25:43.546953] 2022-11-23T03:30:20.5590835Z 2022-11-23T03:30:20.5591465Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_optim_state 2022-11-23T03:30:20.5592456Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_optim_state (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_optim_state_ulxfapac) 2022-11-23T03:30:20.5595333Z 2022-11-23T03:30:20.5595642Z Running tests... 
2022-11-23T03:30:20.5599874Z ---------------------------------------------------------------------- 2022-11-23T03:30:20.5600521Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state 2022-11-23T03:30:20.5601259Z test_flatten_sharded_optim_state_dict_nested (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5602205Z Tests :meth:`flatten_sharded_optim_state_dict` for an FSDP-root ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:30:20.5603011Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55815 2022-11-23T03:30:20.5603497Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55816 2022-11-23T03:30:20.5604103Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55817 2022-11-23T03:30:20.5604870Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55818 2022-11-23T03:30:20.5605759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5606221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5606813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5607290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5607853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5608301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5608876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5609345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5612008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5612476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5613059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5613534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5614139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5614763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5615383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5615850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5616292Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5616820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5617304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5617793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5618437Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5619145Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5619832Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5620676Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5621258Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5621741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5622210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5622680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5623531Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5624647Z warnings.warn( 2022-11-23T03:30:20.5625422Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5625976Z warnings.warn( 2022-11-23T03:30:20.5626730Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5627249Z warnings.warn( 2022-11-23T03:30:20.5628003Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5628523Z warnings.warn( 2022-11-23T03:30:20.5629323Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5629871Z warnings.warn( 2022-11-23T03:30:20.5630669Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5631222Z warnings.warn( 2022-11-23T03:30:20.5632118Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5632688Z warnings.warn( 2022-11-23T03:30:20.5633461Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5633990Z warnings.warn( 2022-11-23T03:30:20.5634248Z dist init r=1, world=4 2022-11-23T03:30:20.5634477Z dist init r=3, world=4 2022-11-23T03:30:20.5634728Z dist init r=2, world=4 2022-11-23T03:30:20.5634980Z dist init r=0, world=4 2022-11-23T03:30:20.5635192Z ok (6.943s) 2022-11-23T03:30:20.5635544Z test_flatten_sharded_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5636242Z Tests :meth:`flatten_sharded_optim_state_dict` for an FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56116 2022-11-23T03:30:20.5636787Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56117 2022-11-23T03:30:20.5637219Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56118 2022-11-23T03:30:20.5637657Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56119 2022-11-23T03:30:20.5638268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5638788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5639374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5639856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5640427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5640875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5641453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5641916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5642499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5642936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5643502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5643985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5644533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.5644989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5645570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5646036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5646478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5646967Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5647470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5647955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5648702Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5649479Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5650179Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5650854Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5651342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5651820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5652272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5652749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5653596Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5654151Z warnings.warn( 2022-11-23T03:30:20.5654907Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5655442Z warnings.warn( 2022-11-23T03:30:20.5656227Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5656764Z warnings.warn( 2022-11-23T03:30:20.5657503Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:30:20.5658033Z warnings.warn( 2022-11-23T03:30:20.5658265Z dist init r=3, world=4 2022-11-23T03:30:20.5658517Z dist init r=0, world=4 2022-11-23T03:30:20.5658775Z dist init r=2, world=4 2022-11-23T03:30:20.5658990Z dist init r=1, world=4 2022-11-23T03:30:20.5659224Z ok (6.122s) 2022-11-23T03:30:20.5659537Z test_full_optim_state_dict_keys (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5660009Z Tests that the parameter keys returned by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56417 2022-11-23T03:30:20.5660523Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56418 2022-11-23T03:30:20.5660973Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56419 2022-11-23T03:30:20.5661395Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56420 2022-11-23T03:30:20.5662004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5662453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5663026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5663475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5664321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5664781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5665336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5665803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5666373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T03:30:20.5666813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5667451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5667935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5668509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5668954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5669501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5669958Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5670408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5670880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5671369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5671857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5672515Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5673250Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5673927Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5674611Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5675122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5675576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5676030Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5676496Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5676830Z dist init r=2, world=4 2022-11-23T03:30:20.5677081Z dist init r=0, world=4 2022-11-23T03:30:20.5677333Z dist init r=3, world=4 2022-11-23T03:30:20.5677561Z dist init r=1, world=4 2022-11-23T03:30:20.5677808Z ok (5.020s) 2022-11-23T03:30:20.5678120Z test_full_optim_state_dict_nested_invalid (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5678641Z Tests that :meth:`full_optim_state_dict` raises an error when ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56718 2022-11-23T03:30:20.5679220Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56719 2022-11-23T03:30:20.5679668Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56720 2022-11-23T03:30:20.5680091Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56721 2022-11-23T03:30:20.5680690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5681134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5681683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5682148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5682725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5683166Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5683715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5684256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5684844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5685286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5685833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5686296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5686872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5687296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5687864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5688324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5688770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5689244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5689728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5690267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5690916Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5691581Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5692258Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5692932Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5693426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5693893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5694352Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5694811Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5695148Z dist init r=3, world=4 2022-11-23T03:30:20.5695396Z dist init r=0, world=4 2022-11-23T03:30:20.5695643Z dist init r=1, world=4 2022-11-23T03:30:20.5695870Z dist init r=2, world=4 2022-11-23T03:30:20.5696101Z ok (4.920s) 2022-11-23T03:30:20.5696407Z test_optim_input_warning (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5696897Z Tests that passing the ``optim_input`` argument into optimizer state ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57019 2022-11-23T03:30:20.5697430Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57020 2022-11-23T03:30:20.5697871Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57021 2022-11-23T03:30:20.5698307Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57022 2022-11-23T03:30:20.5698906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5699350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5699918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5700383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5700985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5701432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5701994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5702436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5703010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5703448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5704344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5704790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5705375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.5705824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5706359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5706816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5707263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5707841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5708306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5708788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5709499Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5710189Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5710851Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5711520Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5712038Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5712506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5712947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5713401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5713751Z dist init r=0, world=4 2022-11-23T03:30:20.5713988Z dist init r=3, world=4 2022-11-23T03:30:20.5714234Z dist init r=2, world=4 2022-11-23T03:30:20.5714476Z dist init r=1, world=4 2022-11-23T03:30:20.5714692Z ok (5.120s) 2022-11-23T03:30:20.5715171Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5715830Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57320 2022-11-23T03:30:20.5716368Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57321 2022-11-23T03:30:20.5716792Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57322 2022-11-23T03:30:20.5717223Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57323 2022-11-23T03:30:20.5717924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5718367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5718941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5719407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5719979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5720411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5720975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5721436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5721991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5722434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5722988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5723430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5723977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T03:30:20.5724504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5725086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5725544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5725975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5726472Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5726956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5727416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5728066Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5728755Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5729428Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5730081Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5730588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5731059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5731519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5731960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5732851Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5733418Z warnings.warn( 2022-11-23T03:30:20.5734207Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5734729Z warnings.warn( 2022-11-23T03:30:20.5735558Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5736122Z warnings.warn( 2022-11-23T03:30:20.5736898Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.5737417Z warnings.warn( 2022-11-23T03:30:20.5737661Z dist init r=2, world=4 2022-11-23T03:30:20.5737911Z dist init r=3, world=4 2022-11-23T03:30:20.5738137Z dist init r=0, world=4 2022-11-23T03:30:20.5738377Z dist init r=1, world=4 2022-11-23T03:30:20.5738607Z ok (5.020s) 2022-11-23T03:30:20.5739073Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5739738Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57621 2022-11-23T03:30:20.5740265Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57622 2022-11-23T03:30:20.5740770Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57623 2022-11-23T03:30:20.5741189Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57624 2022-11-23T03:30:20.5741797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5742240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5742805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5743257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5743831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5744496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5745050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5745518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:30:20.5746084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5746519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5747068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5747535Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5748101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5748533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5749079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5749543Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5749989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5750462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5750944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5751503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5752168Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5752834Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5753509Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5754280Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5754789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5755236Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5755690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5756155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5757053Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5757675Z warnings.warn( 2022-11-23T03:30:20.5758456Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5759006Z warnings.warn( 2022-11-23T03:30:20.5759787Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5760312Z warnings.warn( 2022-11-23T03:30:20.5761086Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.5761630Z warnings.warn( 2022-11-23T03:30:20.5761884Z dist init r=3, world=4 2022-11-23T03:30:20.5762114Z dist init r=0, world=4 2022-11-23T03:30:20.5762357Z dist init r=2, world=4 2022-11-23T03:30:20.5762599Z dist init r=1, world=4 2022-11-23T03:30:20.5762813Z ok (5.020s) 2022-11-23T03:30:20.5763289Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5763950Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57922 2022-11-23T03:30:20.5764463Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57923 2022-11-23T03:30:20.5764908Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57924 2022-11-23T03:30:20.5765341Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57925 2022-11-23T03:30:20.5765949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5766386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5766959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5767424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5768014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5768511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5769090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5769546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:30:20.5770095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5770590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5771163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5771607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5772159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5772623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5773204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5773644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5774095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5774640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5775128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5775594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5776244Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5776928Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5777603Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5778256Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5778772Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5779303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5779766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5780215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5781112Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5781671Z warnings.warn( 2022-11-23T03:30:20.5782435Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5782978Z warnings.warn( 2022-11-23T03:30:20.5785115Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5785758Z warnings.warn( 2022-11-23T03:30:20.5786715Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.5787254Z warnings.warn( 2022-11-23T03:30:20.5787497Z dist init r=1, world=4 2022-11-23T03:30:20.5787744Z dist init r=2, world=4 2022-11-23T03:30:20.5787969Z dist init r=3, world=4 2022-11-23T03:30:20.5788213Z dist init r=0, world=4 2022-11-23T03:30:20.5788444Z ok (5.020s) 2022-11-23T03:30:20.5788913Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5789577Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58223 2022-11-23T03:30:20.5790109Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58224 2022-11-23T03:30:20.5790554Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 58225 2022-11-23T03:30:20.5790974Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 58226 2022-11-23T03:30:20.5791580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5792024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5792596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5793134Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5793709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5794153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5794703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5795173Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:30:20.5795738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5796175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5796723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5797188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5797758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5798193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5798741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5799203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5799649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5800121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5800603Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5801090Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5801740Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5802407Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5803085Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5803812Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5804332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5804782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5805233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5805700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5806586Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5807140Z warnings.warn( 2022-11-23T03:30:20.5807928Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5808479Z warnings.warn( 2022-11-23T03:30:20.5809317Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5809914Z warnings.warn( 2022-11-23T03:30:20.5810698Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.5811240Z warnings.warn( 2022-11-23T03:30:20.5811465Z dist init r=1, world=4 2022-11-23T03:30:20.5811721Z dist init r=3, world=4 2022-11-23T03:30:20.5811968Z dist init r=2, world=4 2022-11-23T03:30:20.5812192Z dist init r=0, world=4 2022-11-23T03:30:20.5812423Z ok (5.120s) 2022-11-23T03:30:20.5812900Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5813562Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58524 2022-11-23T03:30:20.5814080Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58525 2022-11-23T03:30:20.5814524Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 58526 2022-11-23T03:30:20.5814964Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 58527 2022-11-23T03:30:20.5815589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5816025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5816604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5817090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5817641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5818098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5818678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5819137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:30:20.5819692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5820190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5820772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5821224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5821777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5822223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5822782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5823219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5823667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5824450Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5824946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5825419Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5826077Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5826859Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5827535Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5828199Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5828714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5829186Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5829633Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5830091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5830988Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5831546Z warnings.warn( 2022-11-23T03:30:20.5832315Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5832851Z warnings.warn( 2022-11-23T03:30:20.5833639Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5834188Z warnings.warn( 2022-11-23T03:30:20.5834949Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.5835493Z warnings.warn( 2022-11-23T03:30:20.5835738Z dist init r=1, world=4 2022-11-23T03:30:20.5835987Z dist init r=2, world=4 2022-11-23T03:30:20.5836296Z dist init r=0, world=4 2022-11-23T03:30:20.5836629Z dist init r=3, world=4 2022-11-23T03:30:20.5836861Z ok (5.120s) 2022-11-23T03:30:20.5837407Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5838085Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58825 2022-11-23T03:30:20.5838615Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58826 2022-11-23T03:30:20.5839064Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 58827 2022-11-23T03:30:20.5839486Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 58828 2022-11-23T03:30:20.5840089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5840538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5841090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5841560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5842133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5842576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5843126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5843669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:30:20.5844243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5844679Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5845227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5845686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5846253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5846669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5847232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5847692Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5848137Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5848611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5849097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5849584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5850218Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5850903Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5851585Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5852270Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5852763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5853236Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5853689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5854206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5855103Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5855667Z warnings.warn( 2022-11-23T03:30:20.5856460Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5857010Z warnings.warn( 2022-11-23T03:30:20.5857902Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5858444Z warnings.warn( 2022-11-23T03:30:20.5859223Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.5859827Z warnings.warn( 2022-11-23T03:30:20.5860055Z dist init r=2, world=4 2022-11-23T03:30:20.5860301Z dist init r=0, world=4 2022-11-23T03:30:20.5860547Z dist init r=1, world=4 2022-11-23T03:30:20.5860775Z dist init r=3, world=4 2022-11-23T03:30:20.5861003Z ok (5.120s) 2022-11-23T03:30:20.5861478Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5862140Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59126 2022-11-23T03:30:20.5862659Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59127 2022-11-23T03:30:20.5863100Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 59128 2022-11-23T03:30:20.5863540Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 59129 2022-11-23T03:30:20.5864422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5864882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5865456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5865916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5866472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5866924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5867491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5867948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:30:20.5868505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5868952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5869518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5869947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5870520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5871070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5871720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5872165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5872624Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5873129Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5873599Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5874090Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5874750Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5875449Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5876118Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5876794Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5877398Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5877865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5878308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5878762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5879738Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5880308Z warnings.warn( 2022-11-23T03:30:20.5881089Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5881695Z warnings.warn( 2022-11-23T03:30:20.5882490Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5883040Z warnings.warn( 2022-11-23T03:30:20.5883807Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.5884354Z warnings.warn( 2022-11-23T03:30:20.5884610Z dist init r=1, world=4 2022-11-23T03:30:20.5884843Z dist init r=2, world=4 2022-11-23T03:30:20.5885096Z dist init r=3, world=4 2022-11-23T03:30:20.5885351Z dist init r=0, world=4 2022-11-23T03:30:20.5885571Z ok (5.120s) 2022-11-23T03:30:20.5886059Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5886734Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59427 2022-11-23T03:30:20.5887270Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59428 2022-11-23T03:30:20.5887756Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 59429 2022-11-23T03:30:20.5888217Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 59430 2022-11-23T03:30:20.5888830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5889283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5889839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5890309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5890891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5891342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5891898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5892373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:30:20.5892955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5893377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5893953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5894485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5895075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5895498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5896068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5896537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5896975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5897481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5897974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5898469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5899112Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5899806Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5900497Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5901181Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5901684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5902165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5902634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5903091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5904272Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5904854Z warnings.warn( 2022-11-23T03:30:20.5905736Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5906299Z warnings.warn( 2022-11-23T03:30:20.5907070Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5907629Z warnings.warn( 2022-11-23T03:30:20.5908414Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.5908957Z warnings.warn( 2022-11-23T03:30:20.5909191Z dist init r=1, world=4 2022-11-23T03:30:20.5909508Z dist init r=3, world=4 2022-11-23T03:30:20.5909766Z dist init r=2, world=4 2022-11-23T03:30:20.5909997Z dist init r=0, world=4 2022-11-23T03:30:20.5910233Z ok (5.019s) 2022-11-23T03:30:20.5910716Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5911361Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59728 2022-11-23T03:30:20.5911968Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59729 2022-11-23T03:30:20.5912409Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 59730 2022-11-23T03:30:20.5912847Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 59731 2022-11-23T03:30:20.5913445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5913892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5914458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5914924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5915479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5915927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5916496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5916937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:30:20.5917508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5917954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5918513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5918951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5919523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5919969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5920517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5920972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5921419Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5921999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5922471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5922952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5923606Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5924293Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5924954Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5925632Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5926152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5926622Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5927066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5927520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5928381Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5928967Z warnings.warn( 2022-11-23T03:30:20.5929719Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5930272Z warnings.warn( 2022-11-23T03:30:20.5931017Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5931528Z warnings.warn( 2022-11-23T03:30:20.5932276Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:30:20.5932811Z warnings.warn( 2022-11-23T03:30:20.5933593Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5934129Z warnings.warn( 2022-11-23T03:30:20.5934913Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5935459Z warnings.warn( 2022-11-23T03:30:20.5936233Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5936776Z warnings.warn( 2022-11-23T03:30:20.5937531Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5938080Z warnings.warn( 2022-11-23T03:30:20.5938322Z dist init r=1, world=4 2022-11-23T03:30:20.5938556Z dist init r=3, world=4 2022-11-23T03:30:20.5938849Z dist init r=0, world=4 2022-11-23T03:30:20.5939101Z dist init r=2, world=4 2022-11-23T03:30:20.5939315Z ok (5.119s) 2022-11-23T03:30:20.5939798Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5940457Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60029 2022-11-23T03:30:20.5940990Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60030 2022-11-23T03:30:20.5941418Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 60031 2022-11-23T03:30:20.5941854Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 60032 2022-11-23T03:30:20.5942464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5942894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5943468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5944173Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5944773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5945323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5945895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5946355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5946923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5947347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5947917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5948376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5949183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.5949627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5950197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5950654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5951083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5951583Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5952066Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5952548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5953187Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5953880Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5954559Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5955216Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5955751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5956316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5956795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5957243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5958105Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5958656Z warnings.warn( 2022-11-23T03:30:20.5959412Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5959930Z warnings.warn( 2022-11-23T03:30:20.5960674Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5961203Z warnings.warn( 2022-11-23T03:30:20.5961946Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.5962514Z warnings.warn( 2022-11-23T03:30:20.5963301Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5963850Z warnings.warn( 2022-11-23T03:30:20.5964634Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5965159Z warnings.warn( 2022-11-23T03:30:20.5965938Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5966485Z warnings.warn( 2022-11-23T03:30:20.5967261Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.5967781Z warnings.warn( 2022-11-23T03:30:20.5968025Z dist init r=2, world=4 2022-11-23T03:30:20.5968271Z dist init r=0, world=4 2022-11-23T03:30:20.5968501Z dist init r=1, world=4 2022-11-23T03:30:20.5968750Z dist init r=3, world=4 2022-11-23T03:30:20.5968980Z ok (5.120s) 2022-11-23T03:30:20.5969461Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5970115Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60330 2022-11-23T03:30:20.5970658Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60331 2022-11-23T03:30:20.5971101Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 60332 2022-11-23T03:30:20.5971521Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 60333 2022-11-23T03:30:20.5972127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5972627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5973209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5973657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5974231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5974677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5975248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5975692Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5976258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5976698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5977251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5977710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5978277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.5978769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5979379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5979846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5980293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.5980766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5981253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.5981734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.5982379Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5983049Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5983731Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.5984623Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.5985142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.5985597Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.5986050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.5986511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.5986863Z dist init r=3, world=4 2022-11-23T03:30:20.5987094Z dist init r=0, world=4 2022-11-23T03:30:20.5987343Z dist init r=1, world=4 2022-11-23T03:30:20.5987587Z dist init r=2, world=4 2022-11-23T03:30:20.5987802Z ok (4.317s) 2022-11-23T03:30:20.5988285Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.5988967Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60615 2022-11-23T03:30:20.5989550Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60616 2022-11-23T03:30:20.5990006Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 60617 2022-11-23T03:30:20.5990439Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 60618 2022-11-23T03:30:20.5991047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5991478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5992049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5992509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5993069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5993509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5994078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5994538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5995092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.5995535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5996180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5996638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5997191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.5997629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.5998190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.5998627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.5999073Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.5999563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6000052Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6000515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6001164Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6001845Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6002523Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6003176Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6003692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6004164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6004607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6005073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6005417Z dist init r=3, world=4 2022-11-23T03:30:20.6005664Z dist init r=0, world=4 2022-11-23T03:30:20.6005891Z dist init r=2, world=4 2022-11-23T03:30:20.6006137Z dist init r=1, world=4 2022-11-23T03:30:20.6006369Z ok (4.318s) 2022-11-23T03:30:20.6006884Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6007560Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60900 2022-11-23T03:30:20.6008097Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60901 2022-11-23T03:30:20.6008541Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 60902 2022-11-23T03:30:20.6008959Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 60903 2022-11-23T03:30:20.6009618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6010065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6010625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6011090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6011664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6012105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6012724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6013181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6013751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6014169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6014733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6015191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6015760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6016179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6016743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6017198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6017644Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6018120Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6018603Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6019089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6019719Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6020402Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6021079Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6021752Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6022245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6022709Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6023258Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6023729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6024853Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6025409Z warnings.warn( 2022-11-23T03:30:20.6026159Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6026691Z warnings.warn( 2022-11-23T03:30:20.6027416Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6027951Z warnings.warn( 2022-11-23T03:30:20.6028689Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6029226Z warnings.warn( 2022-11-23T03:30:20.6030093Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6030647Z warnings.warn( 2022-11-23T03:30:20.6031440Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6031985Z warnings.warn( 2022-11-23T03:30:20.6032742Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6033282Z warnings.warn( 2022-11-23T03:30:20.6034075Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6034624Z warnings.warn( 2022-11-23T03:30:20.6034853Z dist init r=2, world=4 2022-11-23T03:30:20.6035098Z dist init r=3, world=4 2022-11-23T03:30:20.6035340Z dist init r=1, world=4 2022-11-23T03:30:20.6035564Z dist init r=0, world=4 2022-11-23T03:30:20.6035794Z ok (5.220s) 2022-11-23T03:30:20.6036278Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6036943Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61201 2022-11-23T03:30:20.6037457Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61202 2022-11-23T03:30:20.6037906Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 61203 2022-11-23T03:30:20.6038346Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 61204 2022-11-23T03:30:20.6038940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6039387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6040029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6040507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6041069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6041513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6042086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6042531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6043104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6043542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6044107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6044547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6045112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6045553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6046115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6046618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6047065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6047555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6048020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6048508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6049157Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6049889Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6050562Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6051241Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6051756Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6052222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6052671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6053123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6053986Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6054533Z warnings.warn( 2022-11-23T03:30:20.6055262Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6055795Z warnings.warn( 2022-11-23T03:30:20.6056574Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6057116Z warnings.warn( 2022-11-23T03:30:20.6057849Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6058377Z warnings.warn( 2022-11-23T03:30:20.6059171Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6059717Z warnings.warn( 2022-11-23T03:30:20.6060482Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6061033Z warnings.warn( 2022-11-23T03:30:20.6061806Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6062341Z warnings.warn( 2022-11-23T03:30:20.6063169Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6063721Z warnings.warn( 2022-11-23T03:30:20.6064197Z dist init r=1, world=4 2022-11-23T03:30:20.6064430Z dist init r=0, world=4 2022-11-23T03:30:20.6064675Z dist init r=3, world=4 2022-11-23T03:30:20.6064919Z dist init r=2, world=4 2022-11-23T03:30:20.6065133Z ok (5.119s) 2022-11-23T03:30:20.6065618Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6066280Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61502 2022-11-23T03:30:20.6066808Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61503 2022-11-23T03:30:20.6067236Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 61504 2022-11-23T03:30:20.6067666Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 61505 2022-11-23T03:30:20.6068276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6068729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6069281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6069746Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6070323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6070748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6071321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6071792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6072362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6072789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6073351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6073904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6074490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6074911Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6075477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6075939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6076373Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6076870Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6077356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6077843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6078484Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6079167Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6079986Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6080741Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6081238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6081705Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6082170Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6082614Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6082969Z dist init r=3, world=4 2022-11-23T03:30:20.6083222Z dist init r=2, world=4 2022-11-23T03:30:20.6083469Z dist init r=0, world=4 2022-11-23T03:30:20.6083696Z dist init r=1, world=4 2022-11-23T03:30:20.6083930Z ok (4.317s) 2022-11-23T03:30:20.6084418Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6085071Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61787 2022-11-23T03:30:20.6085602Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61788 2022-11-23T03:30:20.6086051Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 61789 2022-11-23T03:30:20.6086493Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 61790 2022-11-23T03:30:20.6087088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6087536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6088103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6088554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6089128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6089573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6090134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6090630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6091211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6091647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6092195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6092660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6093230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6093666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6094214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6094673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6095123Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6095615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6096084Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6096629Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6097279Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6097940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6098618Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6099298Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6099810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6100259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6100714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6101179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6101530Z dist init r=3, world=4 2022-11-23T03:30:20.6101760Z dist init r=2, world=4 2022-11-23T03:30:20.6102005Z dist init r=0, world=4 2022-11-23T03:30:20.6102248Z dist init r=1, world=4 2022-11-23T03:30:20.6102465Z ok (4.418s) 2022-11-23T03:30:20.6102897Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6103509Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62072 2022-11-23T03:30:20.6104269Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62073 2022-11-23T03:30:20.6104729Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 62074 2022-11-23T03:30:20.6105159Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 62075 2022-11-23T03:30:20.6105772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6106202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6106770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6107235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6107872Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6108330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6108900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6109367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6109970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6110413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6110976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6111429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6111983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6112422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6112985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6113421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6113952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6114440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6114922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6115385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6116040Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6116722Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6117399Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6118053Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6118574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6119040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6119544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6120010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6120906Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6121464Z warnings.warn( 2022-11-23T03:30:20.6122237Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6122781Z warnings.warn( 2022-11-23T03:30:20.6123560Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6124102Z warnings.warn( 2022-11-23T03:30:20.6125019Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6125550Z warnings.warn( 2022-11-23T03:30:20.6125799Z dist init r=1, world=4 2022-11-23T03:30:20.6126044Z dist init r=0, world=4 2022-11-23T03:30:20.6126277Z dist init r=2, world=4 2022-11-23T03:30:20.6126519Z dist init r=3, world=4 2022-11-23T03:30:20.6126751Z ok (5.119s) 2022-11-23T03:30:20.6127165Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6127769Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62373 2022-11-23T03:30:20.6128283Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62374 2022-11-23T03:30:20.6128732Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 62375 2022-11-23T03:30:20.6129152Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 62376 2022-11-23T03:30:20.6129756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6130202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6130822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6131287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6131860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:30:20.6132298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6132852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6133312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6133883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6134322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6134872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6135333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6135900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6136320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6136882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6137341Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6137785Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6138255Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6138487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6138719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6139118Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6139510Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6139948Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6140347Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6140577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6140785Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6141009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6141227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6141882Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6141997Z warnings.warn( 2022-11-23T03:30:20.6142649Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6142760Z warnings.warn( 2022-11-23T03:30:20.6143397Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6143570Z warnings.warn( 2022-11-23T03:30:20.6144411Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6144510Z warnings.warn( 2022-11-23T03:30:20.6144623Z dist init r=3, world=4 2022-11-23T03:30:20.6144732Z dist init r=2, world=4 2022-11-23T03:30:20.6144839Z dist init r=1, world=4 2022-11-23T03:30:20.6144945Z dist init r=0, world=4 2022-11-23T03:30:20.6145043Z ok (5.119s) 2022-11-23T03:30:20.6145355Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6145642Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62674 2022-11-23T03:30:20.6145864Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62675 2022-11-23T03:30:20.6146071Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 62676 2022-11-23T03:30:20.6146277Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 62677 2022-11-23T03:30:20.6146660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6146836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6147212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6147402Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6147761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6147920Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6148295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6148484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6148849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6149100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6149492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6149681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6150044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6150203Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6150576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6150765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6151009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6151248Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6151478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6151706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6152105Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6152579Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6152950Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6153342Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6153572Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6153803Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6154023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6154243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6154865Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6154981Z warnings.warn( 2022-11-23T03:30:20.6155594Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6155704Z warnings.warn( 2022-11-23T03:30:20.6156291Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:30:20.6156405Z warnings.warn( 2022-11-23T03:30:20.6157007Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6157119Z warnings.warn( 2022-11-23T03:30:20.6157766Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6157875Z warnings.warn( 2022-11-23T03:30:20.6158572Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6158687Z warnings.warn( 2022-11-23T03:30:20.6159331Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6159444Z warnings.warn( 2022-11-23T03:30:20.6160066Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6160175Z warnings.warn( 2022-11-23T03:30:20.6160285Z dist init r=1, world=4 2022-11-23T03:30:20.6160393Z dist init r=3, world=4 2022-11-23T03:30:20.6160505Z dist init r=2, world=4 2022-11-23T03:30:20.6160612Z dist init r=0, world=4 2022-11-23T03:30:20.6160711Z ok (5.120s) 2022-11-23T03:30:20.6161018Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6161300Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62975 2022-11-23T03:30:20.6161572Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62976 2022-11-23T03:30:20.6161784Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 62977 2022-11-23T03:30:20.6161993Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 62978 2022-11-23T03:30:20.6162368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6162546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6162928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6163117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6163461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6163638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6164006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6164179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6164558Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6164749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6165122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6165309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6165672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6165831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6166205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6166394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6166638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6166873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6167167Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6167406Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6167805Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6168202Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6168576Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6168963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6169194Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6169423Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6169656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6169866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6170599Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6170768Z warnings.warn( 2022-11-23T03:30:20.6171377Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6171469Z warnings.warn( 2022-11-23T03:30:20.6172070Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6172178Z warnings.warn( 2022-11-23T03:30:20.6172772Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:30:20.6172883Z warnings.warn( 2022-11-23T03:30:20.6173527Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6173634Z warnings.warn( 2022-11-23T03:30:20.6174280Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6174388Z warnings.warn( 2022-11-23T03:30:20.6175022Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6175132Z warnings.warn( 2022-11-23T03:30:20.6175753Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6175860Z warnings.warn( 2022-11-23T03:30:20.6175969Z dist init r=2, world=4 2022-11-23T03:30:20.6176077Z dist init r=0, world=4 2022-11-23T03:30:20.6176184Z dist init r=3, world=4 2022-11-23T03:30:20.6176376Z dist init r=1, world=4 2022-11-23T03:30:20.6176483Z ok (5.120s) 2022-11-23T03:30:20.6176663Z test_rekey_optim_state_dict_to_names (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6176962Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63276 2022-11-23T03:30:20.6177181Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63277 2022-11-23T03:30:20.6177400Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 63278 2022-11-23T03:30:20.6177605Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 63279 2022-11-23T03:30:20.6177981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6178157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6178531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6178705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6179071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6179241Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6179735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6179924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6180282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6180457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6180832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6181024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6181371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6181543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6181914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6182106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6182349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6182584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6182812Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6183042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6183441Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6183818Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6184477Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6184871Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6185100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6185326Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6185621Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6185852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6186510Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6186627Z warnings.warn( 2022-11-23T03:30:20.6187278Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6187371Z warnings.warn( 2022-11-23T03:30:20.6188014Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6188122Z warnings.warn( 2022-11-23T03:30:20.6188762Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6188944Z warnings.warn( 2022-11-23T03:30:20.6189055Z dist init r=3, world=4 2022-11-23T03:30:20.6189162Z dist init r=1, world=4 2022-11-23T03:30:20.6189270Z dist init r=2, world=4 2022-11-23T03:30:20.6189359Z dist init r=0, world=4 2022-11-23T03:30:20.6189458Z ok (5.220s) 2022-11-23T03:30:20.6189728Z test_save_load_without_0th_param_state_state_dict_type_StateDictType_FULL_STATE_DICT (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6190044Z Tests saving and loading an optim state dict for Adam optimizer (i.e. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63577 2022-11-23T03:30:20.6190261Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63578 2022-11-23T03:30:20.6190470Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 63579 2022-11-23T03:30:20.6190677Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 63580 2022-11-23T03:30:20.6191053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6191217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6191599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6191793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6192160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6192337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6192713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6192900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6193262Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6193439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6193796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6193984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6194343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6194562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6194949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6195135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6195379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6195616Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6195828Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6196229Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6196470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6196862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6197255Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6197644Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6197874Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6198155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6198375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6198576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6198690Z dist init r=1, world=4 2022-11-23T03:30:20.6198798Z dist init r=3, world=4 2022-11-23T03:30:20.6198904Z dist init r=0, world=4 2022-11-23T03:30:20.6199014Z dist init r=2, world=4 2022-11-23T03:30:20.6199114Z ok (4.820s) 2022-11-23T03:30:20.6199384Z test_save_load_without_0th_param_state_state_dict_type_StateDictType_SHARDED_STATE_DICT (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6199695Z Tests saving and loading an optim state dict for Adam optimizer (i.e. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63878 2022-11-23T03:30:20.6199902Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63879 2022-11-23T03:30:20.6200111Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 63880 2022-11-23T03:30:20.6200316Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 63881 2022-11-23T03:30:20.6200694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6200868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6201244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6201434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6201796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6201951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6202326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6202517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6202877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6203048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6203468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6203662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6204027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6204199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6204556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6204743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6204985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6205218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6205446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6205675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6206070Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6206465Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6206910Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6207287Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6207516Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6207740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6207964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6208183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6208801Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6208916Z warnings.warn( 2022-11-23T03:30:20.6209570Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6209685Z warnings.warn( 2022-11-23T03:30:20.6210274Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6210385Z warnings.warn( 2022-11-23T03:30:20.6210979Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:30:20.6211091Z warnings.warn( 2022-11-23T03:30:20.6211201Z dist init r=3, world=4 2022-11-23T03:30:20.6211310Z dist init r=1, world=4 2022-11-23T03:30:20.6211416Z dist init r=2, world=4 2022-11-23T03:30:20.6211522Z dist init r=0, world=4 2022-11-23T03:30:20.6211604Z ok (4.919s) 2022-11-23T03:30:20.6211832Z test_scatter_full_optim_state_dict_nested_halve_world_size (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6212278Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64179 2022-11-23T03:30:20.6212548Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64180 2022-11-23T03:30:20.6212769Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 64181 2022-11-23T03:30:20.6212975Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 64182 2022-11-23T03:30:20.6213347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6213525Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6213883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6214075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6214442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6214617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6214991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6215180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6215544Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6215770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6216148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6216317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6216680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6216852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6217226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6217415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6217658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6217890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6218124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6218351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6218729Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6219118Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6219510Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6219900Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6220129Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6220358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6220579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6220795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6221032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:30:20.6221247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:30:20.6221528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:30:20.6221761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:30:20.6222156Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6222547Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6222937Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6223325Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6224324Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6224447Z warnings.warn( 2022-11-23T03:30:20.6225106Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6225281Z warnings.warn( 2022-11-23T03:30:20.6225937Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6226041Z warnings.warn( 2022-11-23T03:30:20.6226686Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6226795Z warnings.warn( 2022-11-23T03:30:20.6227037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:30:20.6227272Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:30:20.6227504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:30:20.6227731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:30:20.6228107Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6228497Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 
2022-11-23T03:30:20.6228892Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6229280Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6229391Z dist init r=2, world=4 2022-11-23T03:30:20.6229498Z dist init r=0, world=4 2022-11-23T03:30:20.6229607Z dist init r=3, world=4 2022-11-23T03:30:20.6229712Z dist init r=1, world=4 2022-11-23T03:30:20.6229796Z ok (5.520s) 2022-11-23T03:30:20.6230106Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6230550Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64498 2022-11-23T03:30:20.6230771Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64499 2022-11-23T03:30:20.6231056Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 64500 2022-11-23T03:30:20.6231277Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 64501 2022-11-23T03:30:20.6231649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6231824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6232207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6232381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6232746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6232920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6233303Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6233492Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6233856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6234027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6234463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6234633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6234993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6235168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6235540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6235732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6235976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6236210Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6236438Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6236669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6237047Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6237440Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6237829Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6238222Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6238452Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6238677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6238901Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6239119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6239772Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6239884Z warnings.warn( 2022-11-23T03:30:20.6240592Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6240711Z warnings.warn( 2022-11-23T03:30:20.6241355Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6241469Z warnings.warn( 2022-11-23T03:30:20.6242108Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6242217Z warnings.warn( 2022-11-23T03:30:20.6242330Z dist init r=2, world=4 2022-11-23T03:30:20.6242438Z dist init r=3, world=4 2022-11-23T03:30:20.6242527Z dist init r=0, world=4 2022-11-23T03:30:20.6242633Z dist init r=1, world=4 2022-11-23T03:30:20.6242732Z ok (5.320s) 2022-11-23T03:30:20.6243038Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6243537Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64799 2022-11-23T03:30:20.6243755Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64800 2022-11-23T03:30:20.6243966Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 64801 2022-11-23T03:30:20.6244171Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 64802 2022-11-23T03:30:20.6244608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6244767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6245145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6245334Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6245701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6245874Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6246252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6246443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6246807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6246966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6247345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6247534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6247899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6248074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6248446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6248739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6248986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6249273Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6249490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6249728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6250126Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6250529Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6250921Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6251311Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6251540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6251877Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6252095Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6252298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6253017Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6253132Z warnings.warn( 2022-11-23T03:30:20.6253782Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6253894Z warnings.warn( 2022-11-23T03:30:20.6254531Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6254638Z warnings.warn( 2022-11-23T03:30:20.6255272Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6255383Z warnings.warn( 2022-11-23T03:30:20.6255493Z dist init r=3, world=4 2022-11-23T03:30:20.6255583Z dist init r=1, world=4 2022-11-23T03:30:20.6255690Z dist init r=2, world=4 2022-11-23T03:30:20.6255795Z dist init r=0, world=4 2022-11-23T03:30:20.6255893Z ok (5.421s) 2022-11-23T03:30:20.6256204Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6256645Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65100 2022-11-23T03:30:20.6256863Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65101 2022-11-23T03:30:20.6257059Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65102 2022-11-23T03:30:20.6257267Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65103 2022-11-23T03:30:20.6257635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6257808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6258182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6258427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6258803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6258975Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6259348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6259523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6259885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6260157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6260534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6260726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6261090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6261262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6261633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6261863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6262114Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6262348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6262578Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6262810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6263210Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6263601Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6264195Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6264611Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6264945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6265156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6265378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6265600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6266251Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6266368Z warnings.warn( 2022-11-23T03:30:20.6267017Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6267126Z warnings.warn( 2022-11-23T03:30:20.6267839Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6267957Z warnings.warn( 2022-11-23T03:30:20.6268614Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6268705Z warnings.warn( 2022-11-23T03:30:20.6268821Z dist init r=0, world=4 2022-11-23T03:30:20.6268929Z dist init r=3, world=4 2022-11-23T03:30:20.6269036Z dist init r=1, world=4 2022-11-23T03:30:20.6269142Z dist init r=2, world=4 2022-11-23T03:30:20.6269241Z ok (5.220s) 2022-11-23T03:30:20.6269544Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6269971Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65401 2022-11-23T03:30:20.6270200Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65402 2022-11-23T03:30:20.6270411Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65403 2022-11-23T03:30:20.6270618Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65404 2022-11-23T03:30:20.6270988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6271234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6271618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6271807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6272172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6272333Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6272705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6272898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6273264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6273441Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6273816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6274003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6274365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6274519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6274896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6275084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6275327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6275571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6275811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6276040Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6276438Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6276881Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6277269Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6277659Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6277915Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6278145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6278362Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6278581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6279242Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6279408Z warnings.warn( 2022-11-23T03:30:20.6280067Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6280234Z warnings.warn( 2022-11-23T03:30:20.6280864Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6280974Z warnings.warn( 2022-11-23T03:30:20.6281616Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6281731Z warnings.warn( 2022-11-23T03:30:20.6281840Z dist init r=0, world=4 2022-11-23T03:30:20.6281947Z dist init r=1, world=4 2022-11-23T03:30:20.6282053Z dist init r=3, world=4 2022-11-23T03:30:20.6282159Z dist init r=2, world=4 2022-11-23T03:30:20.6282239Z ok (5.320s) 2022-11-23T03:30:20.6282548Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6282995Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65702 2022-11-23T03:30:20.6283214Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65703 2022-11-23T03:30:20.6283431Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65704 2022-11-23T03:30:20.6283641Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65705 2022-11-23T03:30:20.6284010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6284184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6284542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6284739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6285106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6285283Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6285655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6285845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6286253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6286433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6286812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6286984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6287350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6287523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6287896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6288085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6288330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6288566Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6288798Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6289026Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6289467Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6289862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6290250Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6290640Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6290865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6291084Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6291307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6291527Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6292181Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6292276Z warnings.warn( 2022-11-23T03:30:20.6292929Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6293041Z warnings.warn( 2022-11-23T03:30:20.6293682Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6293793Z warnings.warn( 2022-11-23T03:30:20.6294431Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6294541Z warnings.warn( 2022-11-23T03:30:20.6294651Z dist init r=1, world=4 2022-11-23T03:30:20.6294758Z dist init r=2, world=4 2022-11-23T03:30:20.6294848Z dist init r=3, world=4 2022-11-23T03:30:20.6295018Z dist init r=0, world=4 2022-11-23T03:30:20.6295125Z ok (5.320s) 2022-11-23T03:30:20.6295432Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6295874Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66003 2022-11-23T03:30:20.6296096Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66004 2022-11-23T03:30:20.6296309Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66005 2022-11-23T03:30:20.6296515Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66006 2022-11-23T03:30:20.6296865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6297040Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6297416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6297606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6297969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6298141Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6298574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6298759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6299116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6299270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6299648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6299837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6300201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6300374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6300753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6300943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6301184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6301402Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6301637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6301867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6302263Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6302656Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6303053Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6303443Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6303669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6304137Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6304463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6304689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6305364Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6305480Z warnings.warn( 2022-11-23T03:30:20.6306132Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6306241Z warnings.warn( 2022-11-23T03:30:20.6306893Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6307003Z warnings.warn( 2022-11-23T03:30:20.6307653Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6307828Z warnings.warn( 2022-11-23T03:30:20.6307921Z dist init r=2, world=4 2022-11-23T03:30:20.6308030Z dist init r=3, world=4 2022-11-23T03:30:20.6308137Z dist init r=1, world=4 2022-11-23T03:30:20.6308242Z dist init r=0, world=4 2022-11-23T03:30:20.6308341Z ok (5.320s) 2022-11-23T03:30:20.6308647Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6309094Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66304 2022-11-23T03:30:20.6309314Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66305 2022-11-23T03:30:20.6309553Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66306 2022-11-23T03:30:20.6309779Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66307 2022-11-23T03:30:20.6310160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6310336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6310711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6310903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6311268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6311441Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6311799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6311991Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6312360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6312530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6312904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6313093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6313507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6313686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6314067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6314236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6314480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6314724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6314955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6315183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6315578Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6315977Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6316370Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6316761Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6317058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6317286Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6317504Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6317723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6318475Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6318631Z warnings.warn( 2022-11-23T03:30:20.6319321Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6319477Z warnings.warn( 2022-11-23T03:30:20.6320148Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6320487Z warnings.warn( 2022-11-23T03:30:20.6321184Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6321334Z warnings.warn( 2022-11-23T03:30:20.6321482Z dist init r=2, world=4 2022-11-23T03:30:20.6321575Z dist init r=0, world=4 2022-11-23T03:30:20.6321721Z dist init r=1, world=4 2022-11-23T03:30:20.6321869Z dist init r=3, world=4 2022-11-23T03:30:20.6322015Z ok (5.220s) 2022-11-23T03:30:20.6322355Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6322890Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66605 2022-11-23T03:30:20.6323152Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66606 2022-11-23T03:30:20.6323504Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66607 2022-11-23T03:30:20.6323702Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66608 2022-11-23T03:30:20.6324117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6348502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6349036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6349251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6349668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6349848Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6350356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6350549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6350914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6351086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6351439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6351818Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6352197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6352370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6352738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6352927Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6353175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6353420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6353659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6353881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6354341Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6354725Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6355112Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6355494Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6355723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6355948Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6356178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6356393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6357050Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6357146Z warnings.warn( 2022-11-23T03:30:20.6357863Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6357986Z warnings.warn( 2022-11-23T03:30:20.6358638Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6358750Z warnings.warn( 2022-11-23T03:30:20.6359387Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6359495Z warnings.warn( 2022-11-23T03:30:20.6359606Z dist init r=1, world=4 2022-11-23T03:30:20.6359717Z dist init r=0, world=4 2022-11-23T03:30:20.6359808Z dist init r=3, world=4 2022-11-23T03:30:20.6359916Z dist init r=2, world=4 2022-11-23T03:30:20.6360016Z ok (5.320s) 2022-11-23T03:30:20.6360232Z test_scatter_full_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6360682Z Tests :meth:`scatter_full_optim_state_dict` for an FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66906 2022-11-23T03:30:20.6360959Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66907 2022-11-23T03:30:20.6361173Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66908 2022-11-23T03:30:20.6361370Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66909 2022-11-23T03:30:20.6361741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6361918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6362295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6362486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6362850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6363024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6363395Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6363583Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6363926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6364094Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6364463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6364649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6365012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6365183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6365551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6365734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6365963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6366245Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6366536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6366780Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6367183Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6367574Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6367968Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6368370Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6368600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6368831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6369039Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6369264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6369506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:30:20.6369854Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:30:20.6370093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:30:20.6370331Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:30:20.6370731Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6371125Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6371508Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:30:20.6371875Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6372109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:30:20.6372349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:30:20.6372585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:30:20.6372822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:30:20.6373216Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6373603Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6373987Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6374374Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6374473Z dist init r=3, world=4 2022-11-23T03:30:20.6374587Z dist init r=1, world=4 2022-11-23T03:30:20.6374700Z dist init r=2, world=4 2022-11-23T03:30:20.6374814Z dist init r=0, world=4 2022-11-23T03:30:20.6374920Z ok (5.921s) 2022-11-23T03:30:20.6375152Z test_shard_full_optim_state_dict_nested_halve_world_size (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6375608Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67225 2022-11-23T03:30:20.6375862Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67226 2022-11-23T03:30:20.6376089Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 67227 2022-11-23T03:30:20.6376301Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 67228 2022-11-23T03:30:20.6376677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6376859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6377235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6377426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6377785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6377958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6378311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6378495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6378861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6379087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6379531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6379722Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6380084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6380258Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6380610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6380804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6381055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6381301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6381545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6381779Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6382176Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6382569Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6382956Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6383345Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6383556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6383785Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6384314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6384553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6384788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:30:20.6385110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:30:20.6385359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:30:20.6385602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:30:20.6385988Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6386382Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6386766Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6387152Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6387815Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6387927Z warnings.warn( 2022-11-23T03:30:20.6388570Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6388791Z warnings.warn( 2022-11-23T03:30:20.6389506Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6389614Z warnings.warn( 2022-11-23T03:30:20.6390253Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6390344Z warnings.warn( 2022-11-23T03:30:20.6390584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:30:20.6390829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:30:20.6391072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:30:20.6391311Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:30:20.6391709Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6392106Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 
2022-11-23T03:30:20.6392496Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6392877Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6392971Z dist init r=1, world=4 2022-11-23T03:30:20.6393081Z dist init r=2, world=4 2022-11-23T03:30:20.6393190Z dist init r=0, world=4 2022-11-23T03:30:20.6393297Z dist init r=3, world=4 2022-11-23T03:30:20.6393398Z ok (5.520s) 2022-11-23T03:30:20.6393704Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6394155Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67544 2022-11-23T03:30:20.6394429Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67545 2022-11-23T03:30:20.6394640Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 67546 2022-11-23T03:30:20.6394851Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 67547 2022-11-23T03:30:20.6395224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6395403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6395783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6395971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6396333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6396504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6396861Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6397047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6397410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6397579Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6398005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6398189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6398546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6398720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6399087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6399253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6399498Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6399746Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6399992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6400230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6400630Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6401023Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6401407Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6401795Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6402006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6402236Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6402460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6402685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6403344Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6403504Z warnings.warn( 2022-11-23T03:30:20.6404163Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6404272Z warnings.warn( 2022-11-23T03:30:20.6404922Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6405030Z warnings.warn( 2022-11-23T03:30:20.6405643Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6405757Z warnings.warn( 2022-11-23T03:30:20.6405868Z dist init r=2, world=4 2022-11-23T03:30:20.6405975Z dist init r=3, world=4 2022-11-23T03:30:20.6406082Z dist init r=0, world=4 2022-11-23T03:30:20.6406190Z dist init r=1, world=4 2022-11-23T03:30:20.6406290Z ok (5.321s) 2022-11-23T03:30:20.6406576Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6407103Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67845 2022-11-23T03:30:20.6407320Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67846 2022-11-23T03:30:20.6407536Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 67847 2022-11-23T03:30:20.6407751Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 67848 2022-11-23T03:30:20.6408121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6408297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6408675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6408865Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6409212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6409382Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6409833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6410028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6410392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6410564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6410937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6411203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6411581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6411736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6412112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6412299Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6412544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6412838Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6413088Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6413322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6413719Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6414098Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6414490Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6414881Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6415112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6415333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6415548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6415770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6416483Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6416599Z warnings.warn( 2022-11-23T03:30:20.6417252Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6417345Z warnings.warn( 2022-11-23T03:30:20.6417987Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6418104Z warnings.warn( 2022-11-23T03:30:20.6418755Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6418868Z warnings.warn( 2022-11-23T03:30:20.6418986Z dist init r=3, world=4 2022-11-23T03:30:20.6419102Z dist init r=0, world=4 2022-11-23T03:30:20.6419214Z dist init r=1, world=4 2022-11-23T03:30:20.6419304Z dist init r=2, world=4 2022-11-23T03:30:20.6419408Z ok (5.320s) 2022-11-23T03:30:20.6419722Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6420175Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68146 2022-11-23T03:30:20.6420396Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68147 2022-11-23T03:30:20.6420626Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68148 2022-11-23T03:30:20.6420862Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68149 2022-11-23T03:30:20.6421232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6421408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6421815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6422021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6422390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6422567Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6422950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6423144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6423503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6423685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6424267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6424546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6424931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6425101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6425468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6425776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6426020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6426267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6426507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6426727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6427135Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6427530Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6427919Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6428310Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6428539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6428767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6428995Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6429220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6429860Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6429978Z warnings.warn( 2022-11-23T03:30:20.6430618Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6430726Z warnings.warn( 2022-11-23T03:30:20.6431436Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6431551Z warnings.warn( 2022-11-23T03:30:20.6432192Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6432303Z warnings.warn( 2022-11-23T03:30:20.6432415Z dist init r=3, world=4 2022-11-23T03:30:20.6432523Z dist init r=1, world=4 2022-11-23T03:30:20.6432613Z dist init r=0, world=4 2022-11-23T03:30:20.6432719Z dist init r=2, world=4 2022-11-23T03:30:20.6432818Z ok (5.220s) 2022-11-23T03:30:20.6433120Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6433568Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68447 2022-11-23T03:30:20.6433786Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68448 2022-11-23T03:30:20.6434000Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68449 2022-11-23T03:30:20.6434214Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68450 2022-11-23T03:30:20.6434623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6434797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6435172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6435363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6435720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6435897Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6436268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6436458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6436798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6436973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6437340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6437526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6437891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6438064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6438428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6438611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6438857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6439086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6439327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6439565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6439962Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6440410Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6440813Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6441204Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6441435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6441661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6441870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6442089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6442748Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6442860Z warnings.warn( 2022-11-23T03:30:20.6443499Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6443659Z warnings.warn( 2022-11-23T03:30:20.6444306Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6444412Z warnings.warn( 2022-11-23T03:30:20.6445045Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6445154Z warnings.warn( 2022-11-23T03:30:20.6445246Z dist init r=0, world=4 2022-11-23T03:30:20.6445354Z dist init r=2, world=4 2022-11-23T03:30:20.6445460Z dist init r=1, world=4 2022-11-23T03:30:20.6445569Z dist init r=3, world=4 2022-11-23T03:30:20.6445670Z ok (5.220s) 2022-11-23T03:30:20.6445977Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6446426Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68748 2022-11-23T03:30:20.6446624Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68749 2022-11-23T03:30:20.6446839Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68750 2022-11-23T03:30:20.6447056Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68751 2022-11-23T03:30:20.6447427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6447602Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6447975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6448170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6448531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6448702Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6449052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6449290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6449662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6449833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6450199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6450392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6450758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6450927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6451270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6451456Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6451703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6451946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6452186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6452420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6452880Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6453270Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6453659Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6454047Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6454258Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6454483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6454705Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6454928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6455583Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6455695Z warnings.warn( 2022-11-23T03:30:20.6456340Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6456447Z warnings.warn( 2022-11-23T03:30:20.6457088Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6457197Z warnings.warn( 2022-11-23T03:30:20.6457820Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6457928Z warnings.warn( 2022-11-23T03:30:20.6458038Z dist init r=0, world=4 2022-11-23T03:30:20.6458144Z dist init r=2, world=4 2022-11-23T03:30:20.6458337Z dist init r=1, world=4 2022-11-23T03:30:20.6458441Z dist init r=3, world=4 2022-11-23T03:30:20.6458540Z ok (5.320s) 2022-11-23T03:30:20.6458825Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6459271Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69049 2022-11-23T03:30:20.6459494Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69050 2022-11-23T03:30:20.6459708Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69051 2022-11-23T03:30:20.6459920Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69052 2022-11-23T03:30:20.6460291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6460468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6460847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6461039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6461388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6461615Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6461988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6462176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6462535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6462705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6463076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6463263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6463605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6463782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6464391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6464580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6464824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6465065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6465309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6465550Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6465944Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6466321Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6466716Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6467101Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6467327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6467625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6467859Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6468075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6468729Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6468843Z warnings.warn( 2022-11-23T03:30:20.6469482Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6469573Z warnings.warn( 2022-11-23T03:30:20.6470215Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6470325Z warnings.warn( 2022-11-23T03:30:20.6470959Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6471151Z warnings.warn( 2022-11-23T03:30:20.6471262Z dist init r=0, world=4 2022-11-23T03:30:20.6471370Z dist init r=3, world=4 2022-11-23T03:30:20.6471476Z dist init r=2, world=4 2022-11-23T03:30:20.6471565Z dist init r=1, world=4 2022-11-23T03:30:20.6471664Z ok (5.320s) 2022-11-23T03:30:20.6471967Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6472417Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69350 2022-11-23T03:30:20.6472635Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69351 2022-11-23T03:30:20.6472850Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69352 2022-11-23T03:30:20.6473066Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69353 2022-11-23T03:30:20.6473433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6473666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6474041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6474233Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6474585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6474756Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6475125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6475314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6475667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6475840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6476195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6476374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6476794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6476976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6477337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6477522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6477771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6478001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6478238Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6478453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6478846Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6479237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6479682Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6480067Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6480346Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6480561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6480784Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6480990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6481639Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6481751Z warnings.warn( 2022-11-23T03:30:20.6482388Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6482501Z warnings.warn( 2022-11-23T03:30:20.6483147Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6483246Z warnings.warn( 2022-11-23T03:30:20.6483880Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6483979Z warnings.warn( 2022-11-23T03:30:20.6484087Z dist init r=2, world=4 2022-11-23T03:30:20.6484178Z dist init r=1, world=4 2022-11-23T03:30:20.6484280Z dist init r=3, world=4 2022-11-23T03:30:20.6484387Z dist init r=0, world=4 2022-11-23T03:30:20.6484476Z ok (5.220s) 2022-11-23T03:30:20.6484775Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6485223Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69651 2022-11-23T03:30:20.6485431Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69652 2022-11-23T03:30:20.6485691Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69653 2022-11-23T03:30:20.6485895Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69654 2022-11-23T03:30:20.6486255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6486429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6486800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6486989Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6487347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6487509Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6487870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6488057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6488395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6488556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6488979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6489156Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6489521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6489683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6490046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6490232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6490459Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6490692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6490933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6491172Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6491560Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6491950Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6492332Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6492717Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6492933Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6493160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6493370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6493577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6494235Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6494382Z warnings.warn( 2022-11-23T03:30:20.6495025Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6495137Z warnings.warn( 2022-11-23T03:30:20.6495776Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6495889Z warnings.warn( 2022-11-23T03:30:20.6496524Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6496628Z warnings.warn( 2022-11-23T03:30:20.6496721Z dist init r=1, world=4 2022-11-23T03:30:20.6496826Z dist init r=0, world=4 2022-11-23T03:30:20.6496924Z dist init r=2, world=4 2022-11-23T03:30:20.6497028Z dist init r=3, world=4 2022-11-23T03:30:20.6497116Z ok (5.220s) 2022-11-23T03:30:20.6497326Z test_shard_full_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6497742Z Tests :meth:`shard_full_optim_state_dict` for an FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69952 2022-11-23T03:30:20.6497999Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69953 2022-11-23T03:30:20.6498207Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69954 2022-11-23T03:30:20.6498412Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69955 2022-11-23T03:30:20.6498773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6498951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6499320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6499508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6499856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6500013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6500386Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6500575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6500924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6501091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6501458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6501637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6502002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6502166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6502513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6502698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6502932Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6503174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6503454Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6503687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6504295Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6504704Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6505092Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6505465Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6505690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6505920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6506145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6506353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6506591Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:30:20.6506914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:30:20.6507148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:30:20.6507376Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:30:20.6507756Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6508147Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6508530Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:30:20.6508911Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:30:20.6509139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:30:20.6509376Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:30:20.6509612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:30:20.6509886Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:30:20.6510279Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6510652Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6511042Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6511418Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:30:20.6511528Z dist init r=3, world=4 2022-11-23T03:30:20.6511627Z dist init r=1, world=4 2022-11-23T03:30:20.6511734Z dist init r=0, world=4 2022-11-23T03:30:20.6511839Z dist init r=2, world=4 2022-11-23T03:30:20.6511930Z ok (6.021s) 2022-11-23T03:30:20.6512234Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_FULL_STATE_DICT_add_to_fsdp_module_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6512614Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70271 2022-11-23T03:30:20.6512836Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70272 2022-11-23T03:30:20.6513051Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70273 2022-11-23T03:30:20.6513249Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70274 2022-11-23T03:30:20.6513628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6513794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6514170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6514344Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6514703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6514873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6515243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6515427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6515846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6516010Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6516373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6516535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6516894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T03:30:20.6517083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6517452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6517629Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6517871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6518111Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6518355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6518588Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6518976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6519357Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6519738Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6520129Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6520352Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6520579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6520797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6521009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6521714Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6521828Z warnings.warn( 2022-11-23T03:30:20.6522463Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6522580Z warnings.warn( 2022-11-23T03:30:20.6523211Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6523319Z warnings.warn( 2022-11-23T03:30:20.6523966Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6524067Z warnings.warn( 2022-11-23T03:30:20.6524177Z dist init r=1, world=4 2022-11-23T03:30:20.6524278Z dist init r=2, world=4 2022-11-23T03:30:20.6524384Z dist init r=0, world=4 2022-11-23T03:30:20.6524528Z dist init r=3, world=4 2022-11-23T03:30:20.6524618Z ok (5.019s) 2022-11-23T03:30:20.6524938Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_FULL_STATE_DICT_add_to_fsdp_module_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6525248Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70572 2022-11-23T03:30:20.6525458Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70573 2022-11-23T03:30:20.6525675Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70574 2022-11-23T03:30:20.6525881Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70575 2022-11-23T03:30:20.6526256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6526413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6526787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6526976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6527326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6527498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6527869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6528058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6528414Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6528579Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6528931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6529123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6529480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6529644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6530010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6530236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6530486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6530721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6530944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6531183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6531589Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6531974Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6532359Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6532747Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6532973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6533194Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6533484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6533702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6534341Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6534445Z warnings.warn( 2022-11-23T03:30:20.6535085Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6535188Z warnings.warn( 2022-11-23T03:30:20.6535838Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6535943Z warnings.warn( 2022-11-23T03:30:20.6536576Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2022-11-23T03:30:20.6536680Z warnings.warn( 2022-11-23T03:30:20.6536794Z dist init r=2, world=4 2022-11-23T03:30:20.6536885Z dist init r=0, world=4 2022-11-23T03:30:20.6536991Z dist init r=1, world=4 2022-11-23T03:30:20.6537098Z dist init r=3, world=4 2022-11-23T03:30:20.6537199Z ok (5.019s) 2022-11-23T03:30:20.6537520Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_SHARDED_STATE_DICT_add_to_fsdp_module_False (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6537831Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70873 2022-11-23T03:30:20.6538049Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70874 2022-11-23T03:30:20.6538266Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70875 2022-11-23T03:30:20.6538461Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70876 2022-11-23T03:30:20.6538882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6539067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6539447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6539639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6540000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6540176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6540547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6540734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6541076Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6541250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6541615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6541802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6542159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6542383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6542760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6542948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6543175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6543422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6543660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6544186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6544604Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6545001Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6545395Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6545785Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6546018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6546228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6546451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6546671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6547294Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6547410Z warnings.warn( 2022-11-23T03:30:20.6548022Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6548131Z warnings.warn( 2022-11-23T03:30:20.6548940Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6549063Z warnings.warn( 2022-11-23T03:30:20.6549684Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:30:20.6549779Z warnings.warn( 2022-11-23T03:30:20.6550426Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6550534Z warnings.warn( 2022-11-23T03:30:20.6551168Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6551275Z warnings.warn( 2022-11-23T03:30:20.6551926Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6552106Z warnings.warn( 2022-11-23T03:30:20.6552743Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6552850Z warnings.warn( 2022-11-23T03:30:20.6552961Z dist init r=0, world=4 2022-11-23T03:30:20.6553052Z dist init r=2, world=4 2022-11-23T03:30:20.6553162Z dist init r=1, world=4 2022-11-23T03:30:20.6553275Z dist init r=3, world=4 2022-11-23T03:30:20.6553373Z ok (4.919s) 2022-11-23T03:30:20.6553694Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_SHARDED_STATE_DICT_add_to_fsdp_module_True (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6554002Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71174 2022-11-23T03:30:20.6554221Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71175 2022-11-23T03:30:20.6554437Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 71176 2022-11-23T03:30:20.6554636Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 71177 2022-11-23T03:30:20.6555009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6555186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6555564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6555757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6556120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6556295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6556666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6556837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6557198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6557421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6557805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6557991Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6558354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6558531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6558896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6559082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6559308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6559551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6559794Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6560029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6560429Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6560887Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6561277Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6561664Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6561891Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6562105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6562329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6562548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6563169Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6563285Z warnings.warn( 2022-11-23T03:30:20.6563902Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6564013Z warnings.warn( 2022-11-23T03:30:20.6564629Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6564738Z warnings.warn( 2022-11-23T03:30:20.6565346Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:30:20.6565441Z warnings.warn( 2022-11-23T03:30:20.6566089Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6566200Z warnings.warn( 2022-11-23T03:30:20.6566885Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6567000Z warnings.warn( 2022-11-23T03:30:20.6567649Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6567760Z warnings.warn( 2022-11-23T03:30:20.6568393Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T03:30:20.6568490Z warnings.warn( 2022-11-23T03:30:20.6568607Z dist init r=1, world=4 2022-11-23T03:30:20.6568719Z dist init r=0, world=4 2022-11-23T03:30:20.6568823Z dist init r=2, world=4 2022-11-23T03:30:20.6568932Z dist init r=3, world=4 2022-11-23T03:30:20.6569014Z ok (5.019s) 2022-11-23T03:30:20.6569197Z test_use_orig_params_error (__main__.TestFSDPOptimState) 2022-11-23T03:30:20.6569509Z Tests that the optimizer state checkpointing APIs raise an error ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71475 2022-11-23T03:30:20.6569726Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71476 2022-11-23T03:30:20.6569993Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 71477 2022-11-23T03:30:20.6570204Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 71478 2022-11-23T03:30:20.6570578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6570752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6571164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6571350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6571712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6571882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6572266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6572444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6572804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:20.6572972Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6573339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6573510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6573874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:20.6574043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:20.6574408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:20.6574597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:20.6574843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:20.6575087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:20.6575422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:20.6575694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:20.6576111Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6576505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6576903Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:20.6577292Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:20.6577519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:20.6577743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:20.6577971Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:20.6578190Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:20.6578284Z dist init r=0, world=4 2022-11-23T03:30:20.6578390Z dist init r=2, world=4 2022-11-23T03:30:20.6578498Z dist init r=1, world=4 2022-11-23T03:30:20.6578604Z dist init r=3, world=4 2022-11-23T03:30:20.6578703Z ok (4.920s) 2022-11-23T03:30:20.6578772Z 2022-11-23T03:30:20.6579047Z ---------------------------------------------------------------------- 2022-11-23T03:30:20.6579163Z Ran 53 tests in 274.572s 2022-11-23T03:30:20.6579183Z 2022-11-23T03:30:20.6579275Z OK 2022-11-23T03:30:20.6579294Z 2022-11-23T03:30:20.6579399Z Generating XML reports... 
2022-11-23T03:30:20.6579919Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state/TEST-TestFSDPOptimState-20221123032545.xml 2022-11-23T03:30:20.6579940Z 2022-11-23T03:30:20.6580501Z ##[endgroup] 2022-11-23T03:30:20.6580994Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_optim_state (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_optim_state_ulxfapac) 2022-11-23T03:30:20.6581015Z 2022-11-23T03:30:20.9520326Z 2022-11-23T03:30:20.9521302Z real 4m42.544s 2022-11-23T03:30:20.9521607Z user 15m43.768s 2022-11-23T03:30:20.9521860Z sys 9m46.939s 2022-11-23T03:30:20.9522152Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:30:20.9522754Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_overlap.py 2022-11-23T03:30:23.3341909Z Ignoring disabled issues: [] 2022-11-23T03:30:23.3877775Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:30:23.3878371Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:30:23.3878728Z Selected tests: 2022-11-23T03:30:23.3878998Z distributed/fsdp/test_fsdp_overlap.py 2022-11-23T03:30:23.3907072Z Prioritized test from test file changes. 
2022-11-23T03:30:23.3907523Z reordering tests for PR: 2022-11-23T03:30:23.3907786Z prioritized: [] 2022-11-23T03:30:23.3908320Z the rest: ['distributed/fsdp/test_fsdp_overlap.py'] 2022-11-23T03:30:23.3908547Z 2022-11-23T03:30:23.3909117Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:30:23.3909996Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:30:23.3916352Z parallel (file granularity) tests: 2022-11-23T03:30:23.3916726Z 2022-11-23T03:30:23.3917004Z serial (file granularity) tests: 2022-11-23T03:30:23.3917317Z distributed/fsdp/test_fsdp_overlap.py 2022-11-23T03:30:25.6714349Z Ignoring disabled issues: [] 2022-11-23T03:30:26.0932797Z Running distributed/fsdp/test_fsdp_overlap.py ... [2022-11-23 03:30:26.092711] 2022-11-23T03:30:26.0934279Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_overlap.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:30:26.093202] 2022-11-23T03:30:44.3566016Z 2022-11-23T03:30:44.3566603Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_overlap 2022-11-23T03:30:44.3567565Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_overlap (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_overlap_qyn8dgye) 2022-11-23T03:30:44.3572153Z 2022-11-23T03:30:44.3572489Z Running tests... 2022-11-23T03:30:44.3573062Z ---------------------------------------------------------------------- 2022-11-23T03:30:44.3573641Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap 2022-11-23T03:30:44.3574191Z test_forward_overlap (__main__.TestForwardOverlapWorldSizeOne) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:30:44.3574718Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71988 2022-11-23T03:30:44.3575355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:44.3575811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:44.3576389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:44.3577202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:44.3577658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:44.3578322Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:30:44.3578848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:44.3580100Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:30:44.3581047Z warnings.warn( 2022-11-23T03:30:44.3581284Z dist init r=0, world=1 2022-11-23T03:30:44.3582014Z 2022-11-23T03:30:44.3582118Z rank0: 2022-11-23T03:30:44.3582632Z e1: {'cpu_iter': 0.0019954503000001013, 'cpu_wait': 4.067869999992979e-05, 'gpu_compute': 0.06590399984270334, 'gpu_total': 0.7830143988132476} 2022-11-23T03:30:44.3583199Z e2: {'cpu_iter': 0.0060055160000000996, 'cpu_wait': 3.639430000017541e-05, 'gpu_compute': 0.26210560016334056, 'gpu_total': 2.4730111837387083} 2022-11-23T03:30:44.3583783Z e3: {'cpu_iter': 0.002039964100000091, 'cpu_wait': 0.22797826339999983, 'gpu_compute': 230.48209190368652, 'gpu_total': 230.7711669921875} 2022-11-23T03:30:44.3584856Z e4: {'cpu_iter': 0.0060562373000003335, 'cpu_wait': 0.2250649770000006, 'gpu_compute': 230.47109298706056, 'gpu_total': 231.06353912353515} 2022-11-23T03:30:44.3585184Z ok (15.865s) 2022-11-23T03:30:44.3586209Z test_forward_overlap (__main__.TestForwardOverlapWorldSizeTwo) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/71183 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T03:30:44.3586868Z 2022-11-23T03:30:44.3587126Z ---------------------------------------------------------------------- 2022-11-23T03:30:44.3587462Z Ran 2 tests in 15.866s 2022-11-23T03:30:44.3587626Z 2022-11-23T03:30:44.3587737Z OK (skipped=1) 2022-11-23T03:30:44.3587892Z 2022-11-23T03:30:44.3588016Z Generating XML reports... 
2022-11-23T03:30:44.3588782Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap/TEST-TestForwardOverlapWorldSizeOne-20221123033028.xml 2022-11-23T03:30:44.3589666Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap/TEST-TestForwardOverlapWorldSizeTwo-20221123033028.xml 2022-11-23T03:30:44.3590050Z 2022-11-23T03:30:44.3590443Z ##[endgroup] 2022-11-23T03:30:44.3591038Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_overlap (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_overlap_qyn8dgye) 2022-11-23T03:30:44.3591399Z 2022-11-23T03:30:44.6928290Z 2022-11-23T03:30:44.6928563Z real 0m23.741s 2022-11-23T03:30:44.6928815Z user 0m27.465s 2022-11-23T03:30:44.6929050Z sys 0m20.998s 2022-11-23T03:30:44.6929331Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:30:44.6929836Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_pure_fp16.py 2022-11-23T03:30:47.0420847Z Ignoring disabled issues: [] 2022-11-23T03:30:47.0953431Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:30:47.0953993Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:30:47.0954322Z Selected tests: 2022-11-23T03:30:47.0954608Z distributed/fsdp/test_fsdp_pure_fp16.py 2022-11-23T03:30:47.0984467Z Prioritized test from test file changes. 
2022-11-23T03:30:47.0984834Z reordering tests for PR: 2022-11-23T03:30:47.0985142Z prioritized: [] 2022-11-23T03:30:47.0985730Z the rest: ['distributed/fsdp/test_fsdp_pure_fp16.py'] 2022-11-23T03:30:47.0985979Z 2022-11-23T03:30:47.0986534Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:30:47.0987460Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:30:47.0992699Z parallel (file granularity) tests: 2022-11-23T03:30:47.0992990Z 2022-11-23T03:30:47.0993245Z serial (file granularity) tests: 2022-11-23T03:30:47.0993541Z distributed/fsdp/test_fsdp_pure_fp16.py 2022-11-23T03:30:49.4116872Z Ignoring disabled issues: [] 2022-11-23T03:30:49.8283881Z Running distributed/fsdp/test_fsdp_pure_fp16.py ... [2022-11-23 03:30:49.827678] 2022-11-23T03:30:49.8285108Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_pure_fp16.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:30:49.828155] 2022-11-23T03:30:58.7989062Z 2022-11-23T03:30:58.7990002Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_pure_fp16 2022-11-23T03:30:58.7991782Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_pure_fp16 (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_pure_fp16_cyq8uofk) 2022-11-23T03:30:58.7992456Z 2022-11-23T03:30:58.7992655Z Running tests... 
2022-11-23T03:30:58.7993571Z ---------------------------------------------------------------------- 2022-11-23T03:30:58.7994706Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16 2022-11-23T03:30:58.7995636Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=False) (__main__.TestPureFP16) 2022-11-23T03:30:58.7996639Z Tests pure FP16 training, including when the parameter's dtype is ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:30:58.7998593Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/73315 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.685s) 2022-11-23T03:30:58.7999836Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=True) (__main__.TestPureFP16) 2022-11-23T03:30:58.8001708Z Tests pure FP16 training, including when the parameter's dtype is ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72276 2022-11-23T03:30:58.8002869Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72277 2022-11-23T03:30:58.8003781Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 72278 2022-11-23T03:30:58.8004712Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 72279 2022-11-23T03:30:58.8006026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:58.8006954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:58.8008172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:58.8009162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:58.8010401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:58.8011304Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:58.8012504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:30:58.8013435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:58.8014956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:58.8016120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:58.8017349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:58.8018333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:58.8019574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:30:58.8020503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:30:58.8021689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:30:58.8022669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:30:58.8023608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:30:58.8025037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:30:58.8026105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:30:58.8027174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:30:58.8028603Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:58.8030066Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:58.8031555Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:30:58.8033039Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:30:58.8034174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:30:58.8035147Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:30:58.8036142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:30:58.8037149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:30:58.8040056Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:30:58.8041786Z warnings.warn( 2022-11-23T03:30:58.8044338Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:30:58.8046020Z warnings.warn( 2022-11-23T03:30:58.8048613Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:30:58.8050459Z warnings.warn( 2022-11-23T03:30:58.8051069Z File "", line 1, in 2022-11-23T03:30:58.8051620Z File "", line 1, in 2022-11-23T03:30:58.8052375Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:30:58.8053150Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:30:58.8053886Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:30:58.8054662Z return self._bootstrap(parent_sentinel) 2022-11-23T03:30:58.8055457Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:30:58.8056207Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:30:58.8056994Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:30:58.8057696Z self.run() 2022-11-23T03:30:58.8058374Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:30:58.8059131Z return self._bootstrap(parent_sentinel) 2022-11-23T03:30:58.8059918Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:30:58.8060682Z self._target(*self._args, **self._kwargs) 2022-11-23T03:30:58.8061439Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:30:58.8062142Z self.run() 2022-11-23T03:30:58.8063157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:30:58.8064193Z self.run_test(test_name, pipe) 2022-11-23T03:30:58.8064974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:30:58.8065739Z self._target(*self._args, **self._kwargs) 2022-11-23T03:30:58.8066883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T03:30:58.8067672Z getattr(self, test_name)() 2022-11-23T03:30:58.8068701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:30:58.8069496Z self.run_test(test_name, pipe) 2022-11-23T03:30:58.8070572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:30:58.8071341Z fn() 2022-11-23T03:30:58.8072363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:30:58.8073171Z getattr(self, test_name)() 2022-11-23T03:30:58.8074426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:30:58.8075339Z test(self, **param_kwargs) 2022-11-23T03:30:58.8076415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:30:58.8077160Z fn() 2022-11-23T03:30:58.8078174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:30:58.8078998Z return func(*args, **kwargs) 2022-11-23T03:30:58.8080068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:30:58.8080874Z test(self, **param_kwargs) 2022-11-23T03:30:58.8081673Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_pure_fp16.py", line 47, in test_pure_fp16 2022-11-23T03:30:58.8082441Z self._test_fsdp_parity( 2022-11-23T03:30:58.8083507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:30:58.8084318Z return func(*args, **kwargs) 2022-11-23T03:30:58.8085410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:30:58.8086249Z fsdp_loss = 
self._train_for_several_steps( 2022-11-23T03:30:58.8087255Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_pure_fp16.py", line 47, in test_pure_fp16 2022-11-23T03:30:58.8088024Z self._test_fsdp_parity( 2022-11-23T03:30:58.8089118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:30:58.8089937Z output = model(*input) 2022-11-23T03:30:58.8091023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:30:58.8091885Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:30:58.8092922Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:30:58.8093774Z return forward_call(*input, **kwargs) 2022-11-23T03:30:58.8094925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:30:58.8095715Z output = model(*input) 2022-11-23T03:30:58.8096836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:30:58.8097771Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:30:58.8098950Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:30:58.8099715Z return forward_call(*input, **kwargs) 2022-11-23T03:30:58.8100834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:30:58.8101659Z _lazy_init(state, module) 2022-11-23T03:30:58.8102750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:30:58.8103681Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:30:58.8105150Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:30:58.8105992Z handle.init_flat_param_attributes() 2022-11-23T03:30:58.8107097Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:30:58.8107899Z _lazy_init(state, module) 2022-11-23T03:30:58.8108929Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:30:58.8109690Z return func(*args, **kwargs) 2022-11-23T03:30:58.8110884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:30:58.8111743Z handle.init_flat_param_attributes() 2022-11-23T03:30:58.8112898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:30:58.8113670Z p_assert( 2022-11-23T03:30:58.8114669Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:30:58.8115478Z return func(*args, **kwargs) 2022-11-23T03:30:58.8116533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:30:58.8117372Z traceback.print_stack() 2022-11-23T03:30:58.8118544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:30:58.8119381Z p_assert( 2022-11-23T03:30:58.8120507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:30:58.8121309Z traceback.print_stack() 2022-11-23T03:30:58.8123905Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:30:58.8125762Z warnings.warn( 2022-11-23T03:30:58.8126287Z File "", line 1, in 2022-11-23T03:30:58.8127038Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:30:58.8127950Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:30:58.8128650Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:30:58.8129195Z return self._bootstrap(parent_sentinel) 2022-11-23T03:30:58.8129774Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:30:58.8130324Z self.run() 2022-11-23T03:30:58.8130836Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:30:58.8131403Z self._target(*self._args, **self._kwargs) 2022-11-23T03:30:58.8132471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:30:58.8133102Z self.run_test(test_name, pipe) 2022-11-23T03:30:58.8133966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:30:58.8134621Z getattr(self, test_name)() 2022-11-23T03:30:58.8135582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:30:58.8136139Z fn() 2022-11-23T03:30:58.8136927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:30:58.8137557Z test(self, **param_kwargs) 2022-11-23T03:30:58.8138354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:30:58.8138954Z return func(*args, **kwargs) 
2022-11-23T03:30:58.8139567Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_pure_fp16.py", line 47, in test_pure_fp16 2022-11-23T03:30:58.8140288Z self._test_fsdp_parity( 2022-11-23T03:30:58.8141159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:30:58.8141835Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:30:58.8142783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:30:58.8143585Z output = model(*input) 2022-11-23T03:30:58.8144954Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:30:58.8145618Z return forward_call(*input, **kwargs) 2022-11-23T03:30:58.8146522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:30:58.8147240Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:30:58.8148302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:30:58.8148884Z _lazy_init(state, module) 2022-11-23T03:30:58.8149891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:30:58.8150523Z handle.init_flat_param_attributes() 2022-11-23T03:30:58.8151350Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:30:58.8151945Z return func(*args, **kwargs) 2022-11-23T03:30:58.8152838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:30:58.8153462Z p_assert( 2022-11-23T03:30:58.8154366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T03:30:58.8154936Z traceback.print_stack() 2022-11-23T03:30:58.8155528Z File "", line 1, in 2022-11-23T03:30:58.8156092Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:30:58.8156799Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:30:58.8157383Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:30:58.8157985Z return self._bootstrap(parent_sentinel) 2022-11-23T03:30:58.8158560Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:30:58.8159085Z self.run() 2022-11-23T03:30:58.8159615Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:30:58.8160181Z self._target(*self._args, **self._kwargs) 2022-11-23T03:30:58.8161019Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:30:58.8161653Z self.run_test(test_name, pipe) 2022-11-23T03:30:58.8162543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:30:58.8163173Z getattr(self, test_name)() 2022-11-23T03:30:58.8164024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:30:58.8164621Z fn() 2022-11-23T03:30:58.8165430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:30:58.8166082Z test(self, **param_kwargs) 2022-11-23T03:30:58.8166933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:30:58.8167560Z return func(*args, **kwargs) 2022-11-23T03:30:58.8168262Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_pure_fp16.py", line 47, in test_pure_fp16 2022-11-23T03:30:58.8168889Z self._test_fsdp_parity( 2022-11-23T03:30:58.8169794Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:30:58.8170500Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:30:58.8171453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:30:58.8172097Z output = model(*input) 2022-11-23T03:30:58.8172844Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:30:58.8173458Z return forward_call(*input, **kwargs) 2022-11-23T03:30:58.8174450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:30:58.8175495Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:30:58.8176418Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:30:58.8177044Z _lazy_init(state, module) 2022-11-23T03:30:58.8177893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:30:58.8178529Z handle.init_flat_param_attributes() 2022-11-23T03:30:58.8179394Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:30:58.8180155Z return func(*args, **kwargs) 2022-11-23T03:30:58.8181176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:30:58.8181767Z p_assert( 2022-11-23T03:30:58.8182543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:30:58.8183218Z traceback.print_stack() 2022-11-23T03:30:58.8183645Z dist init r=3, world=4 2022-11-23T03:30:58.8184755Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:30:58.8185664Z dist init r=0, world=4 2022-11-23T03:30:58.8186440Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:30:58.8187149Z dist init r=1, world=4 2022-11-23T03:30:58.8187906Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:30:58.8188636Z dist init r=2, world=4 2022-11-23T03:30:58.8189380Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:30:58.8190075Z ok (4.921s) 2022-11-23T03:30:58.8190314Z 2022-11-23T03:30:58.8190777Z ---------------------------------------------------------------------- 2022-11-23T03:30:58.8191291Z Ran 2 tests in 6.606s 2022-11-23T03:30:58.8191677Z 2022-11-23T03:30:58.8191837Z OK (skipped=1) 2022-11-23T03:30:58.8192069Z 2022-11-23T03:30:58.8192255Z Generating XML reports... 
2022-11-23T03:30:58.8193158Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20221123033051.xml 2022-11-23T03:30:58.8193711Z 2022-11-23T03:30:58.8194204Z ##[endgroup] 2022-11-23T03:30:58.8195173Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_pure_fp16 (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_pure_fp16_cyq8uofk) 2022-11-23T03:30:58.8196003Z 2022-11-23T03:30:59.2057223Z 2022-11-23T03:30:59.2057625Z real 0m14.513s 2022-11-23T03:30:59.2057944Z user 0m31.700s 2022-11-23T03:30:59.2058179Z sys 0m25.057s 2022-11-23T03:30:59.2058461Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:30:59.2059086Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_sharded_grad_scaler.py 2022-11-23T03:31:01.5939931Z Ignoring disabled issues: [] 2022-11-23T03:31:01.6461898Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:31:01.6462588Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:31:01.6462932Z Selected tests: 2022-11-23T03:31:01.6463241Z distributed/fsdp/test_fsdp_sharded_grad_scaler.py 2022-11-23T03:31:01.6488244Z Prioritized test from test file changes. 
2022-11-23T03:31:01.6488597Z reordering tests for PR: 2022-11-23T03:31:01.6488865Z prioritized: [] 2022-11-23T03:31:01.6489367Z the rest: ['distributed/fsdp/test_fsdp_sharded_grad_scaler.py'] 2022-11-23T03:31:01.6489830Z 2022-11-23T03:31:01.6490382Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:31:01.6491316Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:31:01.6496882Z parallel (file granularity) tests: 2022-11-23T03:31:01.6497132Z 2022-11-23T03:31:01.6498071Z serial (file granularity) tests: 2022-11-23T03:31:01.6498789Z distributed/fsdp/test_fsdp_sharded_grad_scaler.py 2022-11-23T03:31:03.9601691Z Ignoring disabled issues: [] 2022-11-23T03:31:04.3673795Z Running distributed/fsdp/test_fsdp_sharded_grad_scaler.py ... [2022-11-23 03:31:04.366825] 2022-11-23T03:31:04.3675314Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_sharded_grad_scaler.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:31:04.367330] 2022-11-23T03:31:48.7933383Z 2022-11-23T03:31:48.7934099Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_sharded_grad_scaler 2022-11-23T03:31:48.7937948Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_sharded_grad_scaler (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_sharded_grad_scaler_42praqrh) 2022-11-23T03:31:48.7938785Z 2022-11-23T03:31:48.7938923Z Running tests... 
2022-11-23T03:31:48.7939440Z ---------------------------------------------------------------------- 2022-11-23T03:31:48.7940055Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_sharded_grad_scaler 2022-11-23T03:31:48.7940661Z test_grad_scaling (__main__.TestShardGradScaler) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:31:48.7943238Z ok (1.735s) 2022-11-23T03:31:48.7943676Z test_inf_gradients_skip_optim_step (__main__.TestShardGradScaler) ... ok (0.002s) 2022-11-23T03:31:48.7944508Z test_scaling_unscaling_sparse (__main__.TestShardGradScaler) ... ok (0.007s) 2022-11-23T03:31:48.7945139Z test_fsdp_ddp_parity_with_grad_scaler_offload_false_none_mixed_precision (__main__.TestShardedGradScalerParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72820 2022-11-23T03:31:48.7945785Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72821 2022-11-23T03:31:48.7946238Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 72822 2022-11-23T03:31:48.7946637Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 72823 2022-11-23T03:31:48.7947311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.7947790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.7949599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.7950183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.7950714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.7951243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.7951867Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.7952244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.7952931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.7953386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.7953871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.7954627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.7955169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.7955686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.7956236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.7956722Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.7957166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:31:48.7957656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:31:48.7958216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:31:48.7958754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:31:48.7959455Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:31:48.7960352Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.7961000Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.7961813Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.7962328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:31:48.7962750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:31:48.7963248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:31:48.7963765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:31:48.7964181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7964709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7965217Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7965621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7966997Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.7967799Z warnings.warn( 2022-11-23T03:31:48.7968942Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.7969811Z warnings.warn( 2022-11-23T03:31:48.7970936Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.7971786Z warnings.warn( 2022-11-23T03:31:48.7972938Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.7973661Z warnings.warn( 2022-11-23T03:31:48.7974081Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7974534Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7975084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:31:48.7975561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7975922Z dist init r=2, world=4 2022-11-23T03:31:48.7976098Z dist init r=1, world=4 2022-11-23T03:31:48.7976360Z dist init r=0, world=4 2022-11-23T03:31:48.7976714Z dist init r=3, world=4 2022-11-23T03:31:48.7976933Z ok (5.123s) 2022-11-23T03:31:48.7977388Z test_fsdp_ddp_parity_with_grad_scaler_offload_false_none_none (__main__.TestShardedGradScalerParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73121 2022-11-23T03:31:48.7978165Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73122 2022-11-23T03:31:48.7978603Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 73123 2022-11-23T03:31:48.7978964Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 73124 2022-11-23T03:31:48.7979597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.7980062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.7980614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.7981071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.7981650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.7982110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.7982701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.7983175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.7983760Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.7984661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.7985257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.7985731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.7986288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.7986744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.7987336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.7987800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.7988238Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:31:48.7988835Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:31:48.7989348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:31:48.7989844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:31:48.7990489Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.7991190Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.7991944Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:31:48.7992570Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.7993070Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:31:48.7993558Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:31:48.7994036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:31:48.7994489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:31:48.7994978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7995560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7996045Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7996503Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.7997792Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.7998571Z warnings.warn( 2022-11-23T03:31:48.7999731Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8000507Z warnings.warn( 2022-11-23T03:31:48.8001640Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8002407Z warnings.warn( 2022-11-23T03:31:48.8003561Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8004332Z warnings.warn( 2022-11-23T03:31:48.8004771Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8005252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8005740Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8006229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:31:48.8006597Z dist init r=0, world=4 2022-11-23T03:31:48.8006844Z dist init r=1, world=4 2022-11-23T03:31:48.8007103Z dist init r=2, world=4 2022-11-23T03:31:48.8007361Z dist init r=3, world=4 2022-11-23T03:31:48.8007584Z ok (4.921s) 2022-11-23T03:31:48.8008147Z test_fsdp_ddp_parity_with_grad_scaler_offload_false_shard_grad_op_mixed_precision (__main__.TestShardedGradScalerParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73422 2022-11-23T03:31:48.8008819Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73423 2022-11-23T03:31:48.8009232Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 73424 2022-11-23T03:31:48.8009681Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 73425 2022-11-23T03:31:48.8010308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8010772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8011400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8011881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8012465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8012893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8013479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8013949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8014533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8014954Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8015533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8016004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8016590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8017014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8017593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8018066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8018508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:31:48.8019014Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:31:48.8019506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:31:48.8020002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:31:48.8020647Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8021342Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8022119Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8022904Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:31:48.8023406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:31:48.8024238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:31:48.8024734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:31:48.8025190Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:31:48.8025694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8026183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8026581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8027047Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8028391Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8029292Z warnings.warn( 2022-11-23T03:31:48.8030535Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8031386Z warnings.warn( 2022-11-23T03:31:48.8032582Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8033431Z warnings.warn( 2022-11-23T03:31:48.8034658Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8035497Z warnings.warn( 2022-11-23T03:31:48.8035900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8036395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8036921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8037441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8037833Z dist init r=1, world=4 2022-11-23T03:31:48.8038085Z dist init r=2, world=4 2022-11-23T03:31:48.8038358Z dist init r=0, world=4 2022-11-23T03:31:48.8038628Z dist init r=3, world=4 2022-11-23T03:31:48.8038863Z ok (5.021s) 2022-11-23T03:31:48.8039512Z test_fsdp_ddp_parity_with_grad_scaler_offload_false_shard_grad_op_none (__main__.TestShardedGradScalerParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73723 2022-11-23T03:31:48.8040199Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73724 2022-11-23T03:31:48.8040667Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 73725 2022-11-23T03:31:48.8041157Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 73726 2022-11-23T03:31:48.8041822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8042374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8042949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8043429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8044025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8044457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8045037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8045514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8046178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8046656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8047178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8047652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8048238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:31:48.8048662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8049241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8049718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8050157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:31:48.8050671Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:31:48.8051171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:31:48.8051669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:31:48.8052310Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8053012Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8053704Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8054387Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:31:48.8054894Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:31:48.8055376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:31:48.8055857Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:31:48.8056304Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:31:48.8056853Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8057356Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8057840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8058301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8059650Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8060439Z warnings.warn( 2022-11-23T03:31:48.8061601Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8062440Z warnings.warn( 2022-11-23T03:31:48.8063574Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8064726Z warnings.warn( 2022-11-23T03:31:48.8065811Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8066669Z warnings.warn( 2022-11-23T03:31:48.8067129Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8067516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8068007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8068498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8068861Z dist init r=0, world=4 2022-11-23T03:31:48.8069105Z dist init r=1, world=4 2022-11-23T03:31:48.8069366Z dist init r=2, world=4 2022-11-23T03:31:48.8069624Z dist init r=3, world=4 2022-11-23T03:31:48.8069847Z ok (4.920s) 2022-11-23T03:31:48.8070391Z test_fsdp_ddp_parity_with_grad_scaler_offload_true_none_mixed_precision (__main__.TestShardedGradScalerParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74024 2022-11-23T03:31:48.8071025Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74025 2022-11-23T03:31:48.8071459Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 74026 2022-11-23T03:31:48.8071915Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 74027 2022-11-23T03:31:48.8072538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8072999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8073647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8074168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8074802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8075284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8075879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8076385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8077009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8077475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8078073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8078559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8079175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T03:31:48.8079660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8080377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8080885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8081343Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:31:48.8081880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:31:48.8082414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:31:48.8082950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:31:48.8083628Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8084367Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8085104Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8085802Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:31:48.8086312Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:31:48.8086793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:31:48.8087270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:31:48.8087747Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:31:48.8088211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8088705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8089203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8089669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8091056Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8091792Z warnings.warn( 2022-11-23T03:31:48.8092955Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8093741Z warnings.warn( 2022-11-23T03:31:48.8094907Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8095650Z warnings.warn( 2022-11-23T03:31:48.8096800Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8097631Z warnings.warn( 2022-11-23T03:31:48.8097910Z File "", line 1, in 2022-11-23T03:31:48.8098267Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8098651Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8099037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8099420Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8099791Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8100137Z self.run() 2022-11-23T03:31:48.8100480Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8100837Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8101395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8101762Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8102273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8102674Z getattr(self, test_name)() 2022-11-23T03:31:48.8103203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8103681Z fn() 2022-11-23T03:31:48.8104262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8104688Z test(self, **param_kwargs) 2022-11-23T03:31:48.8105243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8105589Z return func(*args, **kwargs) 2022-11-23T03:31:48.8106045Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8106573Z self._test_fsdp_parity( 
2022-11-23T03:31:48.8107002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8107414Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8108059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8108478Z output = model(*input) 2022-11-23T03:31:48.8108947Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8109426Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8109907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8110376Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8110924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8111324Z _lazy_init(state, module) 2022-11-23T03:31:48.8111837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8112232Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8112759Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8113147Z return func(*args, **kwargs) 2022-11-23T03:31:48.8113691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8114053Z p_assert( 2022-11-23T03:31:48.8114629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8115024Z traceback.print_stack() 2022-11-23T03:31:48.8115297Z File "", line 1, in 2022-11-23T03:31:48.8115671Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in 
spawn_main 2022-11-23T03:31:48.8116052Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8116332Z File "", line 1, in 2022-11-23T03:31:48.8116703Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8117083Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8117369Z File "", line 1, in 2022-11-23T03:31:48.8117755Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8118625Z self.run() 2022-11-23T03:31:48.8118981Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8119341Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8119733Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8120108Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8120462Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8120843Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8121219Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8121600Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8121966Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8122375Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8122874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8123251Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8123638Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8123982Z self.run() 2022-11-23T03:31:48.8124309Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8124652Z self.run() 2022-11-23T03:31:48.8125159Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8125561Z getattr(self, test_name)() 2022-11-23T03:31:48.8125970Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8126358Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8126741Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8127182Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8127729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8128115Z fn() 2022-11-23T03:31:48.8128570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8128959Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8129461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8129860Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8130373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8130780Z test(self, **param_kwargs) 2022-11-23T03:31:48.8131307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8131681Z getattr(self, test_name)() 2022-11-23T03:31:48.8132198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8132655Z getattr(self, test_name)() 2022-11-23T03:31:48.8133181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8133561Z return func(*args, **kwargs) 2022-11-23T03:31:48.8134084Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8134464Z fn() 2022-11-23T03:31:48.8134927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8135295Z fn() 2022-11-23T03:31:48.8135719Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8136122Z self._test_fsdp_parity( 2022-11-23T03:31:48.8136653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8137062Z test(self, **param_kwargs) 2022-11-23T03:31:48.8137593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8137960Z test(self, **param_kwargs) 2022-11-23T03:31:48.8138484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8138914Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8139430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8139847Z return func(*args, **kwargs) 2022-11-23T03:31:48.8140460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8140862Z return func(*args, **kwargs) 2022-11-23T03:31:48.8141375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8141844Z output = model(*input) 2022-11-23T03:31:48.8142294Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8142698Z 
self._test_fsdp_parity( 2022-11-23T03:31:48.8143145Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8143561Z self._test_fsdp_parity( 2022-11-23T03:31:48.8144455Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8144888Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8145427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8145871Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8146294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8146725Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8147283Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8147802Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8148307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8148719Z output = model(*input) 2022-11-23T03:31:48.8149258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8149636Z output = model(*input) 2022-11-23T03:31:48.8150161Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8150563Z _lazy_init(state, module) 2022-11-23T03:31:48.8151148Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8151522Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8152025Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8152415Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8152917Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8153340Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8153904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8154366Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8154920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8155383Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8155931Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8156296Z return func(*args, **kwargs) 2022-11-23T03:31:48.8156823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8157223Z _lazy_init(state, module) 2022-11-23T03:31:48.8157855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8158139Z _lazy_init(state, module) 2022-11-23T03:31:48.8158830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8159227Z p_assert( 2022-11-23T03:31:48.8159702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8160122Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8160657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T03:31:48.8161069Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8161566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8161967Z traceback.print_stack() 2022-11-23T03:31:48.8162620Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8163004Z return func(*args, **kwargs) 2022-11-23T03:31:48.8163561Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8163888Z return func(*args, **kwargs) 2022-11-23T03:31:48.8164402Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8164799Z p_assert( 2022-11-23T03:31:48.8165395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8165703Z p_assert( 2022-11-23T03:31:48.8166153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8166542Z traceback.print_stack() 2022-11-23T03:31:48.8167044Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8167416Z traceback.print_stack() 2022-11-23T03:31:48.8167828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8168321Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8168809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8169341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8169826Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:31:48.8170309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8170769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8171251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8171616Z dist init r=2, world=4 2022-11-23T03:31:48.8172104Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:31:48.8172538Z dist init r=3, world=4 2022-11-23T03:31:48.8173010Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:31:48.8173460Z dist init r=1, world=4 2022-11-23T03:31:48.8173906Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:31:48.8174356Z dist init r=0, world=4 2022-11-23T03:31:48.8174825Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:31:48.8175261Z ok (5.021s) 2022-11-23T03:31:48.8175773Z test_fsdp_ddp_parity_with_grad_scaler_offload_true_none_none (__main__.TestShardedGradScalerParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74325 2022-11-23T03:31:48.8176391Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74326 2022-11-23T03:31:48.8176846Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 74327 2022-11-23T03:31:48.8177301Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 74328 2022-11-23T03:31:48.8177911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8178367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8178957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8179417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8180065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8180530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8181116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8181566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8182160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8182611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8183166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8183643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8184643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:31:48.8185096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8185661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8186133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8186593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:31:48.8187108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:31:48.8187587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:31:48.8188088Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:31:48.8188758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8189431Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8190124Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8190812Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:31:48.8191348Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:31:48.8191810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:31:48.8192278Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:31:48.8192757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:31:48.8193250Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8193723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8194217Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8194700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8195983Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8196750Z warnings.warn( 2022-11-23T03:31:48.8197976Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8198773Z warnings.warn( 2022-11-23T03:31:48.8199932Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8200704Z warnings.warn( 2022-11-23T03:31:48.8201862Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8202675Z warnings.warn( 2022-11-23T03:31:48.8202955Z File "<string>", line 1, in <module> 2022-11-23T03:31:48.8203343Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8203701Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8204010Z File "<string>", line 1, in <module> 2022-11-23T03:31:48.8204388Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8204772Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8205153Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8205493Z self.run() 2022-11-23T03:31:48.8205840Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8206198Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8206588Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8206971Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8207330Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8207706Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8208244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8208646Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8209011Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8209361Z self.run() 2022-11-23T03:31:48.8209873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8210414Z getattr(self, test_name)() 2022-11-23T03:31:48.8210963Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8211583Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8212230Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8212617Z fn() 2022-11-23T03:31:48.8213154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8213619Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8214257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8214892Z test(self, **param_kwargs) 2022-11-23T03:31:48.8215507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8215914Z getattr(self, test_name)() 2022-11-23T03:31:48.8216532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8217051Z return func(*args, **kwargs) 2022-11-23T03:31:48.8217643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8218024Z fn() 2022-11-23T03:31:48.8218505Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8219054Z self._test_fsdp_parity( 2022-11-23T03:31:48.8219600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8220081Z test(self, **param_kwargs) 2022-11-23T03:31:48.8220673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8221160Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8221759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8222325Z return func(*args, **kwargs) 
2022-11-23T03:31:48.8222938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8223350Z output = model(*input) 2022-11-23T03:31:48.8224114Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8224753Z self._test_fsdp_parity( 2022-11-23T03:31:48.8225265Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8225752Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8226369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8226915Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8227494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8228034Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8228746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8229272Z output = model(*input) 2022-11-23T03:31:48.8229766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8230231Z _lazy_init(state, module) 2022-11-23T03:31:48.8230790Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8231211Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8231811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8232334Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8232974Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8233455Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8234097Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8234588Z return func(*args, **kwargs) 2022-11-23T03:31:48.8235133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8235704Z _lazy_init(state, module) 2022-11-23T03:31:48.8236342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8236798Z p_assert( 2022-11-23T03:31:48.8237342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8237831Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8259860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8260340Z traceback.print_stack() 2022-11-23T03:31:48.8260940Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8261345Z return func(*args, **kwargs) 2022-11-23T03:31:48.8261867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8262260Z p_assert( 2022-11-23T03:31:48.8262744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8263150Z traceback.print_stack() 2022-11-23T03:31:48.8263419Z File "<string>", line 1, in <module> 2022-11-23T03:31:48.8263725Z File "<string>", line 1, in <module> 2022-11-23T03:31:48.8264585Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8265155Z 
exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8265567Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8265975Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8266283Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8266686Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8267103Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8267468Z self.run() 2022-11-23T03:31:48.8267808Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8268213Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8268626Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8269001Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8269423Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8269790Z self.run() 2022-11-23T03:31:48.8270301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8270730Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8271127Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8271504Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8272089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8272522Z getattr(self, test_name)() 2022-11-23T03:31:48.8273057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8273452Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8274023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8274430Z fn() 
2022-11-23T03:31:48.8274932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8275365Z getattr(self, test_name)() 2022-11-23T03:31:48.8275929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8276387Z test(self, **param_kwargs) 2022-11-23T03:31:48.8277000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8277421Z fn() 2022-11-23T03:31:48.8277946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8278357Z return func(*args, **kwargs) 2022-11-23T03:31:48.8278919Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8279358Z test(self, **param_kwargs) 2022-11-23T03:31:48.8279843Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8280275Z self._test_fsdp_parity( 2022-11-23T03:31:48.8280833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8281263Z return func(*args, **kwargs) 2022-11-23T03:31:48.8281802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8282263Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8282769Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8283228Z self._test_fsdp_parity( 2022-11-23T03:31:48.8283777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T03:31:48.8284278Z output = model(*input) 2022-11-23T03:31:48.8284835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8285261Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8285806Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8286234Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8286828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8287238Z output = model(*input) 2022-11-23T03:31:48.8287813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8288318Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8288870Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8289291Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8289859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8290279Z _lazy_init(state, module) 2022-11-23T03:31:48.8290821Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8291318Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8291914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8292331Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8292921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8293356Z _lazy_init(state, module) 
2022-11-23T03:31:48.8293888Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8294278Z return func(*args, **kwargs) 2022-11-23T03:31:48.8294832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8295275Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8295906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8296323Z p_assert( 2022-11-23T03:31:48.8296834Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8297232Z return func(*args, **kwargs) 2022-11-23T03:31:48.8297763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8298188Z traceback.print_stack() 2022-11-23T03:31:48.8298771Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8299165Z p_assert( 2022-11-23T03:31:48.8299670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8300085Z traceback.print_stack() 2022-11-23T03:31:48.8300493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8301015Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8301540Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8302156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8302565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:31:48.8303157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8303676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8304741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8305102Z dist init r=0, world=4 2022-11-23T03:31:48.8305523Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:31:48.8305989Z dist init r=2, world=4 2022-11-23T03:31:48.8306438Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:31:48.8306889Z dist init r=3, world=4 2022-11-23T03:31:48.8307368Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:31:48.8307803Z dist init r=1, world=4 2022-11-23T03:31:48.8308278Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:31:48.8308715Z ok (5.020s) 2022-11-23T03:31:48.8309279Z test_fsdp_ddp_parity_with_grad_scaler_offload_true_shard_grad_op_mixed_precision (__main__.TestShardedGradScalerParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74626 2022-11-23T03:31:48.8309962Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74627 2022-11-23T03:31:48.8310380Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 74628 2022-11-23T03:31:48.8310841Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 74629 2022-11-23T03:31:48.8311482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8311928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8312506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8312977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8313537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8314092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8314691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8315168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8315724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8316197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8316777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8317230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8317813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:31:48.8318273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8318854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8319305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8319774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:31:48.8320289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:31:48.8320918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:31:48.8321399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:31:48.8322063Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8322849Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8323611Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8324197Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:31:48.8324722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:31:48.8325212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:31:48.8325669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:31:48.8326147Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:31:48.8326634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8327185Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8327673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8328091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8329455Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8330160Z warnings.warn( 2022-11-23T03:31:48.8331349Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8332131Z warnings.warn( 2022-11-23T03:31:48.8333269Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8334042Z warnings.warn( 2022-11-23T03:31:48.8335181Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8335948Z warnings.warn( 2022-11-23T03:31:48.8336195Z File "<string>", line 1, in <module> 2022-11-23T03:31:48.8336632Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8337004Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8337359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8337740Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8338130Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8338473Z self.run() 2022-11-23T03:31:48.8338803Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8339183Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8339713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8340095Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8340641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8341056Z getattr(self, test_name)() 2022-11-23T03:31:48.8341561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8341943Z fn() 2022-11-23T03:31:48.8342529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8342970Z test(self, **param_kwargs) 2022-11-23T03:31:48.8343440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8344091Z return func(*args, **kwargs) 2022-11-23T03:31:48.8344660Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8345076Z self._test_fsdp_parity( 
2022-11-23T03:31:48.8345622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8346067Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8346619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8346922Z output = model(*input) 2022-11-23T03:31:48.8347450Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8347811Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8348512Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8348926Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8349507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8349917Z _lazy_init(state, module) 2022-11-23T03:31:48.8350412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8350832Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8351359Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8351729Z return func(*args, **kwargs) 2022-11-23T03:31:48.8352274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8352669Z p_assert( 2022-11-23T03:31:48.8353156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8353525Z traceback.print_stack() 2022-11-23T03:31:48.8353826Z File "<string>", line 1, in <module> 2022-11-23T03:31:48.8354211Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in 
spawn_main 2022-11-23T03:31:48.8354666Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8355046Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8355434Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8355810Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8356163Z self.run() 2022-11-23T03:31:48.8356508Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8356889Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8357397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8357899Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8358338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8358716Z getattr(self, test_name)() 2022-11-23T03:31:48.8359073Z File "<string>", line 1, in <module> 2022-11-23T03:31:48.8359617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8359973Z fn() 2022-11-23T03:31:48.8360537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8360880Z test(self, **param_kwargs) 2022-11-23T03:31:48.8361256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8361616Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8362161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8362564Z return func(*args, **kwargs) 2022-11-23T03:31:48.8362905Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8363291Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8363768Z 
File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8364202Z self._test_fsdp_parity( 2022-11-23T03:31:48.8364555Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8364902Z self.run() 2022-11-23T03:31:48.8365498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8365899Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8366293Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8366694Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8367232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8367644Z output = model(*input) 2022-11-23T03:31:48.8368148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8368555Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8369029Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8369431Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8369970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8370343Z getattr(self, test_name)() 2022-11-23T03:31:48.8370889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8371357Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8371923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8372275Z fn() 2022-11-23T03:31:48.8372776Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8373252Z _lazy_init(state, module) 2022-11-23T03:31:48.8373757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8374164Z test(self, **param_kwargs) 2022-11-23T03:31:48.8374684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8375107Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8375630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8376033Z return func(*args, **kwargs) 2022-11-23T03:31:48.8376537Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8376905Z return func(*args, **kwargs) 2022-11-23T03:31:48.8377356Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8377789Z self._test_fsdp_parity( 2022-11-23T03:31:48.8378312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8378706Z p_assert( 2022-11-23T03:31:48.8379215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8379648Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8380156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8380550Z traceback.print_stack() 2022-11-23T03:31:48.8381097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8381476Z output = 
model(*input) 2022-11-23T03:31:48.8381972Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8382467Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8383001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8383373Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8384260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8384660Z _lazy_init(state, module) 2022-11-23T03:31:48.8385169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8385590Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8386112Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8386505Z return func(*args, **kwargs) 2022-11-23T03:31:48.8387028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8387423Z p_assert( 2022-11-23T03:31:48.8387695Z File "", line 1, in 2022-11-23T03:31:48.8388179Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8388573Z traceback.print_stack() 2022-11-23T03:31:48.8388958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8389321Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8389704Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8390088Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8390488Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8390808Z self.run() 
2022-11-23T03:31:48.8391239Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8391620Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8392230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8392533Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8393068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8393479Z getattr(self, test_name)() 2022-11-23T03:31:48.8393983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8394362Z fn() 2022-11-23T03:31:48.8394864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8395246Z test(self, **param_kwargs) 2022-11-23T03:31:48.8395771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8396182Z return func(*args, **kwargs) 2022-11-23T03:31:48.8396611Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8397050Z self._test_fsdp_parity( 2022-11-23T03:31:48.8397584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8398018Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8398561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8398970Z output = model(*input) 2022-11-23T03:31:48.8399460Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8399833Z return forward_call(*input, 
**kwargs) 2022-11-23T03:31:48.8400397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8400870Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8401448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8401829Z _lazy_init(state, module) 2022-11-23T03:31:48.8402344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8402825Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8403337Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8403732Z return func(*args, **kwargs) 2022-11-23T03:31:48.8404278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8404679Z p_assert( 2022-11-23T03:31:48.8405138Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8405544Z traceback.print_stack() 2022-11-23T03:31:48.8405950Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8406425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8406920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8407426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8407919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8408388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:31:48.8408877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8409490Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8409790Z dist init r=1, world=4 2022-11-23T03:31:48.8410366Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:31:48.8410737Z dist init r=0, world=4 2022-11-23T03:31:48.8411220Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:31:48.8411651Z dist init r=2, world=4 2022-11-23T03:31:48.8412125Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:31:48.8412572Z dist init r=3, world=4 2022-11-23T03:31:48.8413019Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:31:48.8413465Z ok (5.120s) 2022-11-23T03:31:48.8414010Z test_fsdp_ddp_parity_with_grad_scaler_offload_true_shard_grad_op_none (__main__.TestShardedGradScalerParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74927 2022-11-23T03:31:48.8414654Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74928 2022-11-23T03:31:48.8415091Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 74929 2022-11-23T03:31:48.8415552Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 74930 2022-11-23T03:31:48.8416184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8416652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8417213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8417702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8418293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8418724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8419310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8419851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8420447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:31:48.8420871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8421451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8421926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8422489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T03:31:48.8422943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:31:48.8423572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:31:48.8424325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:31:48.8424820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:31:48.8425328Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:31:48.8425855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:31:48.8426262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:31:48.8427024Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8427722Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8428418Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:31:48.8429115Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:31:48.8429621Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:31:48.8430158Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:31:48.8430666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:31:48.8431045Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:31:48.8431536Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8432031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8432522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8432995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8434281Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8435080Z warnings.warn( 2022-11-23T03:31:48.8436310Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8437095Z warnings.warn( 2022-11-23T03:31:48.8438213Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:31:48.8439003Z warnings.warn( 2022-11-23T03:31:48.8440144Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:31:48.8440921Z warnings.warn( 2022-11-23T03:31:48.8441205Z File "", line 1, in 2022-11-23T03:31:48.8441563Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8441948Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8442398Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8442859Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8443265Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8443608Z self.run() 2022-11-23T03:31:48.8443954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8444307Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8444851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8445250Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8445761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8446170Z getattr(self, test_name)() 2022-11-23T03:31:48.8446760Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8447119Z fn() 2022-11-23T03:31:48.8447622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8448031Z test(self, **param_kwargs) 2022-11-23T03:31:48.8448555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8448958Z return func(*args, **kwargs) 2022-11-23T03:31:48.8449394Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8449827Z self._test_fsdp_parity( 
2022-11-23T03:31:48.8450336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8450772Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8451089Z File "", line 1, in 2022-11-23T03:31:48.8451645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8452028Z output = model(*input) 2022-11-23T03:31:48.8452524Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8452927Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8453289Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8453674Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8454347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8454833Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8455217Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8455600Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8456164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8456546Z _lazy_init(state, module) 2022-11-23T03:31:48.8456926Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8457277Z self.run() 2022-11-23T03:31:48.8457748Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8458168Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8458560Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8459003Z self._target(*self._args, 
**self._kwargs) 2022-11-23T03:31:48.8459500Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8459893Z return func(*args, **kwargs) 2022-11-23T03:31:48.8460397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8460900Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8461409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8461798Z p_assert( 2022-11-23T03:31:48.8462313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8462693Z getattr(self, test_name)() 2022-11-23T03:31:48.8463201Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8463596Z traceback.print_stack() 2022-11-23T03:31:48.8464528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8464902Z fn() 2022-11-23T03:31:48.8465383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8465757Z test(self, **param_kwargs) 2022-11-23T03:31:48.8466292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8466590Z return func(*args, **kwargs) 2022-11-23T03:31:48.8467015Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8467435Z self._test_fsdp_parity( 2022-11-23T03:31:48.8467960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8468389Z fsdp_loss = 
self._train_for_several_steps( 2022-11-23T03:31:48.8468924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8469321Z output = model(*input) 2022-11-23T03:31:48.8469804Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8470183Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8470728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8471191Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8471743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8472141Z _lazy_init(state, module) 2022-11-23T03:31:48.8472733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8473157Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8473658Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8474075Z return func(*args, **kwargs) 2022-11-23T03:31:48.8474584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8474944Z p_assert( 2022-11-23T03:31:48.8475414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8475795Z traceback.print_stack() 2022-11-23T03:31:48.8476086Z File "", line 1, in 2022-11-23T03:31:48.8476438Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8476814Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8477189Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T03:31:48.8477543Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8477932Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T03:31:48.8478268Z self.run() 2022-11-23T03:31:48.8478582Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8479067Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8479567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8479959Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8480463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8480863Z getattr(self, test_name)() 2022-11-23T03:31:48.8481381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8481731Z fn() 2022-11-23T03:31:48.8482221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8482618Z test(self, **param_kwargs) 2022-11-23T03:31:48.8483112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8483513Z return func(*args, **kwargs) 2022-11-23T03:31:48.8483964Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8484384Z self._test_fsdp_parity( 2022-11-23T03:31:48.8484883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8485302Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8485862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T03:31:48.8486320Z output = model(*input) 2022-11-23T03:31:48.8486777Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T03:31:48.8487166Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8487716Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8488158Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8488721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8489116Z _lazy_init(state, module) 2022-11-23T03:31:48.8489471Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8489672Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8490005Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8490135Z return func(*args, **kwargs) 2022-11-23T03:31:48.8490514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8490620Z p_assert( 2022-11-23T03:31:48.8490964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8491092Z traceback.print_stack() 2022-11-23T03:31:48.8491224Z File "", line 1, in 2022-11-23T03:31:48.8491415Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T03:31:48.8491559Z exitcode = _main(fd, parent_sentinel) 2022-11-23T03:31:48.8491762Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T03:31:48.8491918Z return self._bootstrap(parent_sentinel) 2022-11-23T03:31:48.8492132Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 
2022-11-23T03:31:48.8492237Z self.run() 2022-11-23T03:31:48.8492441Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T03:31:48.8492678Z self._target(*self._args, **self._kwargs) 2022-11-23T03:31:48.8492943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T03:31:48.8493111Z self.run_test(test_name, pipe) 2022-11-23T03:31:48.8493477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T03:31:48.8493603Z getattr(self, test_name)() 2022-11-23T03:31:48.8493997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T03:31:48.8494064Z fn() 2022-11-23T03:31:48.8494433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T03:31:48.8494558Z test(self, **param_kwargs) 2022-11-23T03:31:48.8494895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T03:31:48.8495071Z return func(*args, **kwargs) 2022-11-23T03:31:48.8495319Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_sharded_grad_scaler.py", line 171, in test_fsdp_ddp_parity_with_grad_scaler 2022-11-23T03:31:48.8495446Z self._test_fsdp_parity( 2022-11-23T03:31:48.8495811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T03:31:48.8495968Z fsdp_loss = self._train_for_several_steps( 2022-11-23T03:31:48.8496347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T03:31:48.8496469Z output = model(*input) 2022-11-23T03:31:48.8496777Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T03:31:48.8496918Z return forward_call(*input, **kwargs) 2022-11-23T03:31:48.8497295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T03:31:48.8497477Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T03:31:48.8497850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T03:31:48.8497975Z _lazy_init(state, module) 2022-11-23T03:31:48.8498326Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T03:31:48.8498473Z handle.init_flat_param_attributes() 2022-11-23T03:31:48.8498791Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T03:31:48.8498968Z return func(*args, **kwargs) 2022-11-23T03:31:48.8499361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T03:31:48.8499466Z p_assert( 2022-11-23T03:31:48.8499805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T03:31:48.8499933Z traceback.print_stack() 2022-11-23T03:31:48.8500179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8500417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8500626Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8500863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8501094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8501328Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:31:48.8501556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8501781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:31:48.8501894Z dist init r=3, world=4 2022-11-23T03:31:48.8502290Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:3 after the FSDP constructor. 2022-11-23T03:31:48.8502383Z dist init r=0, world=4 2022-11-23T03:31:48.8502716Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T03:31:48.8502827Z dist init r=2, world=4 2022-11-23T03:31:48.8503146Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:2 after the FSDP constructor. 2022-11-23T03:31:48.8503258Z dist init r=1, world=4 2022-11-23T03:31:48.8503585Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T03:31:48.8503690Z ok (5.122s) 2022-11-23T03:31:48.8503714Z 2022-11-23T03:31:48.8504340Z ---------------------------------------------------------------------- 2022-11-23T03:31:48.8504462Z Ran 11 tests in 42.014s 2022-11-23T03:31:48.8504462Z 2022-11-23T03:31:48.8504504Z OK 2022-11-23T03:31:48.8504523Z 2022-11-23T03:31:48.8504750Z Generating XML reports... 
2022-11-23T03:31:48.8505232Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_sharded_grad_scaler/TEST-TestShardGradScaler-20221123033106.xml 2022-11-23T03:31:48.8505781Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_sharded_grad_scaler/TEST-TestShardedGradScalerParityWithDDP-20221123033106.xml 2022-11-23T03:31:48.8505801Z 2022-11-23T03:31:48.8506308Z ##[endgroup] 2022-11-23T03:31:48.8506822Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_sharded_grad_scaler (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_sharded_grad_scaler_42praqrh) 2022-11-23T03:31:48.8506842Z 2022-11-23T03:31:49.1478524Z 2022-11-23T03:31:49.1478846Z real 0m49.942s 2022-11-23T03:31:49.1479013Z user 2m30.407s 2022-11-23T03:31:49.1479407Z sys 1m40.272s 2022-11-23T03:31:49.1479709Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:31:49.1480214Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_state_dict.py 2022-11-23T03:31:51.5198537Z Ignoring disabled issues: [] 2022-11-23T03:31:51.5730381Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:31:51.5730952Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:31:51.5731284Z Selected tests: 2022-11-23T03:31:51.5731875Z distributed/fsdp/test_fsdp_state_dict.py 2022-11-23T03:31:51.5759841Z Prioritized test from test file changes. 
2022-11-23T03:31:51.5760409Z reordering tests for PR: 2022-11-23T03:31:51.5760676Z prioritized: [] 2022-11-23T03:31:51.5761183Z the rest: ['distributed/fsdp/test_fsdp_state_dict.py'] 2022-11-23T03:31:51.5761404Z 2022-11-23T03:31:51.5761935Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:31:51.5762873Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:31:51.5769379Z parallel (file granularity) tests: 2022-11-23T03:31:51.5769668Z 2022-11-23T03:31:51.5770180Z serial (file granularity) tests: 2022-11-23T03:31:51.5770480Z distributed/fsdp/test_fsdp_state_dict.py 2022-11-23T03:31:53.8549738Z Ignoring disabled issues: [] 2022-11-23T03:31:54.2744774Z Running distributed/fsdp/test_fsdp_state_dict.py ... [2022-11-23 03:31:54.273803] 2022-11-23T03:31:54.2746221Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_state_dict.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:31:54.274253] 2022-11-23T03:40:07.7044052Z 2022-11-23T03:40:07.7044682Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_state_dict 2022-11-23T03:40:07.7047253Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_state_dict (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_state_dict_aav7zu6g) 2022-11-23T03:40:07.7049153Z 2022-11-23T03:40:07.7049722Z Running tests... 
2022-11-23T03:40:07.7050350Z ---------------------------------------------------------------------- 2022-11-23T03:40:07.7050944Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict 2022-11-23T03:40:07.7051657Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7052316Z Tests that we can save a state_dict and load it into a blank model ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:40:07.7052788Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75440 2022-11-23T03:40:07.7053256Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75441 2022-11-23T03:40:07.7053876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7054354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7054917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7055402Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7055977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7056430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7057012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7057481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7057947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7058436Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7059097Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7060088Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7060637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7061099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7061451Z dist init r=0, world=2 2022-11-23T03:40:07.7061701Z dist init r=1, world=2 2022-11-23T03:40:07.7063593Z ok (5.952s) 2022-11-23T03:40:07.7064472Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7065209Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75583 2022-11-23T03:40:07.7065748Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75584 2022-11-23T03:40:07.7066420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7066868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7067448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7067914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7068686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7069114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7069694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7070155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7070601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7071101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7071765Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7072446Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7072949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7073429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7073787Z dist init r=1, world=2 2022-11-23T03:40:07.7074022Z dist init r=0, world=2 2022-11-23T03:40:07.7074485Z ok (4.016s) 2022-11-23T03:40:07.7075010Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7075704Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75726 2022-11-23T03:40:07.7076209Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75727 2022-11-23T03:40:07.7076832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7077279Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7077857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7078304Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7078887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7079429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7080009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7080452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7080903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:40:07.7081412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7082048Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7082731Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7083253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7083737Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7084071Z dist init r=0, world=2 2022-11-23T03:40:07.7084328Z dist init r=1, world=2 2022-11-23T03:40:07.7084575Z ok (4.016s) 2022-11-23T03:40:07.7085072Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7085817Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75869 2022-11-23T03:40:07.7086353Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75870 2022-11-23T03:40:07.7086956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7087391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7087968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7088441Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7089024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7089454Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7090034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7090496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7090956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7091433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7092099Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7092791Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7093307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7093765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7094132Z dist init r=0, world=2 2022-11-23T03:40:07.7094375Z dist init r=1, world=2 2022-11-23T03:40:07.7094599Z ok (4.017s) 2022-11-23T03:40:07.7095124Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7095862Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76012 2022-11-23T03:40:07.7096401Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76013 2022-11-23T03:40:07.7096994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7097436Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7098016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7098471Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7099062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7099502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7100076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7100534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7100997Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 0 2022-11-23T03:40:07.7101493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7102144Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7102875Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7103393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7104199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7104566Z dist init r=0, world=2 2022-11-23T03:40:07.7104809Z dist init r=1, world=2 2022-11-23T03:40:07.7105058Z ok (4.117s) 2022-11-23T03:40:07.7105578Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7106249Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76155 2022-11-23T03:40:07.7106782Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76156 2022-11-23T03:40:07.7107403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7107847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7108403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7108878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7109457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7109892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7110456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7110915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7111376Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7111854Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7112505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7113279Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7113808Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7114256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7114612Z dist init r=0, world=2 2022-11-23T03:40:07.7114861Z dist init r=1, world=2 2022-11-23T03:40:07.7115088Z ok (4.117s) 2022-11-23T03:40:07.7115611Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7116289Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76298 2022-11-23T03:40:07.7116818Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76299 2022-11-23T03:40:07.7117423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7117870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7118442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7118910Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7119553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7119994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7120559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7121009Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7121465Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:40:07.7121965Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7122612Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7123274Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7123803Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7124272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7124635Z dist init r=0, world=2 2022-11-23T03:40:07.7124866Z dist init r=1, world=2 2022-11-23T03:40:07.7125115Z ok (4.014s) 2022-11-23T03:40:07.7125619Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7126290Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76441 2022-11-23T03:40:07.7126814Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76442 2022-11-23T03:40:07.7127436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7127880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7128439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7128906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7129483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7129956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7130535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7130996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7131440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7131915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7132562Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7133245Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7133764Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7134218Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7134566Z dist init r=0, world=2 2022-11-23T03:40:07.7134817Z dist init r=1, world=2 2022-11-23T03:40:07.7135035Z ok (4.016s) 2022-11-23T03:40:07.7135548Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7136287Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76584 2022-11-23T03:40:07.7136811Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76585 2022-11-23T03:40:07.7137401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7137844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7138419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7138887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7139509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7139950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7140520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7140963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7141414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:40:07.7141909Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7142564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7143230Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7143802Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7144584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7144925Z dist init r=0, world=2 2022-11-23T03:40:07.7145177Z dist init r=1, world=2 2022-11-23T03:40:07.7145417Z ok (4.117s) 2022-11-23T03:40:07.7145926Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7146672Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76727 2022-11-23T03:40:07.7147205Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76728 2022-11-23T03:40:07.7147822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7148268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7148819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7149285Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7149858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7150276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7150842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7152622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7153099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7153571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7154234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7155027Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7155548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7155997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7156346Z dist init r=0, world=2 2022-11-23T03:40:07.7156599Z dist init r=1, world=2 2022-11-23T03:40:07.7156822Z ok (4.016s) 2022-11-23T03:40:07.7157339Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7158025Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76870 2022-11-23T03:40:07.7158552Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76871 2022-11-23T03:40:07.7159137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7159587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7160156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7160605Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7161185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7161686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7162255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7162704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7163153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:40:07.7163643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7164295Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7165017Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7165541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7166003Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7166340Z dist init r=0, world=2 2022-11-23T03:40:07.7166591Z dist init r=1, world=2 2022-11-23T03:40:07.7166831Z ok (4.216s) 2022-11-23T03:40:07.7167325Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7168009Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77013 2022-11-23T03:40:07.7168532Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77014 2022-11-23T03:40:07.7169141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7169575Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7170149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7170610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7171263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7171685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7172255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7172716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7173168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7173649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7174304Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7174986Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7175489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7175957Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7176308Z dist init r=0, world=2 2022-11-23T03:40:07.7176559Z dist init r=1, world=2 2022-11-23T03:40:07.7176780Z ok (4.116s) 2022-11-23T03:40:07.7177294Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7177978Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77156 2022-11-23T03:40:07.7178484Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77157 2022-11-23T03:40:07.7179090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7179544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7180113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7180563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7181142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7181635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7182205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7182647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7183102Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 0 2022-11-23T03:40:07.7183605Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7184506Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7185188Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7185705Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7186177Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7186515Z dist init r=1, world=2 2022-11-23T03:40:07.7186763Z dist init r=0, world=2 2022-11-23T03:40:07.7187004Z ok (4.217s) 2022-11-23T03:40:07.7187493Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7188256Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77299 2022-11-23T03:40:07.7188779Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77300 2022-11-23T03:40:07.7189391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7189820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7190396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7190862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7191439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7191858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7192432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7192895Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7193338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7193819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7194470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7195144Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7195643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7196111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7196470Z dist init r=0, world=2 2022-11-23T03:40:07.7196704Z dist init r=1, world=2 2022-11-23T03:40:07.7196943Z ok (4.116s) 2022-11-23T03:40:07.7197447Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7198187Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77442 2022-11-23T03:40:07.7198700Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77443 2022-11-23T03:40:07.7199307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7199750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7200319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7200771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7201347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7201790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7202340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7202807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7203257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-11-23T03:40:07.7203748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7204383Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7205124Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7205643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7206113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7206451Z dist init r=0, world=2 2022-11-23T03:40:07.7206704Z dist init r=1, world=2 2022-11-23T03:40:07.7206946Z ok (4.116s) 2022-11-23T03:40:07.7207438Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7208121Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77585 2022-11-23T03:40:07.7208647Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77586 2022-11-23T03:40:07.7209252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7209678Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7210247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7210709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7211268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7211710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7212272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7212733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7213162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7213656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7214304Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7215036Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7215538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7216008Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7216363Z dist init r=0, world=2 2022-11-23T03:40:07.7216595Z dist init r=1, world=2 2022-11-23T03:40:07.7216836Z ok (4.116s) 2022-11-23T03:40:07.7217363Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7218054Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77728 2022-11-23T03:40:07.7218563Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77729 2022-11-23T03:40:07.7219176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7219626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7220209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7220660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7221290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7221729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7222275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7222737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7223189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 0 2022-11-23T03:40:07.7223684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7224577Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7225264Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7225786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7226257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7226592Z dist init r=0, world=2 2022-11-23T03:40:07.7226843Z dist init r=1, world=2 2022-11-23T03:40:07.7227082Z ok (4.216s) 2022-11-23T03:40:07.7227580Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7228270Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77871 2022-11-23T03:40:07.7228793Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77872 2022-11-23T03:40:07.7229402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7229835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7230402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7230866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7231422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7231933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7232510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7232971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7233407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7233906Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7234561Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7235250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7235749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7236224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7236580Z dist init r=1, world=2 2022-11-23T03:40:07.7236814Z dist init r=0, world=2 2022-11-23T03:40:07.7237047Z ok (4.216s) 2022-11-23T03:40:07.7237564Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7238317Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78014 2022-11-23T03:40:07.7238823Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78015 2022-11-23T03:40:07.7239490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7239939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7240498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7241057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7241627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7242068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7242624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7243087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7243584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 0 2022-11-23T03:40:07.7244080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7244722Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7245400Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7245917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7246366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7246723Z dist init r=1, world=2 2022-11-23T03:40:07.7246972Z dist init r=0, world=2 2022-11-23T03:40:07.7247193Z ok (4.016s) 2022-11-23T03:40:07.7247709Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7249738Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78157 2022-11-23T03:40:07.7250274Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78158 2022-11-23T03:40:07.7250885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7251316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7251887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7252354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7252910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7253352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7253919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7254384Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7254883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7255376Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7256024Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7256767Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7257265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7257736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7258087Z dist init r=1, world=2 2022-11-23T03:40:07.7258320Z dist init r=0, world=2 2022-11-23T03:40:07.7258560Z ok (4.117s) 2022-11-23T03:40:07.7259077Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7259767Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78300 2022-11-23T03:40:07.7260277Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78301 2022-11-23T03:40:07.7260881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7261327Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7261878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7262342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7262922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7263357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7264100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7264588Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7265038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:40:07.7265558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7266219Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7266977Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7267505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7267955Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7268307Z dist init r=0, world=2 2022-11-23T03:40:07.7268558Z dist init r=1, world=2 2022-11-23T03:40:07.7268780Z ok (4.216s) 2022-11-23T03:40:07.7269303Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7269990Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78443 2022-11-23T03:40:07.7270509Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78444 2022-11-23T03:40:07.7271104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7271545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7272198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7272668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7273305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7273750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7274315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7274779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7275208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7275706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7276359Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7277025Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7277556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7278018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7278368Z dist init r=0, world=2 2022-11-23T03:40:07.7278601Z dist init r=1, world=2 2022-11-23T03:40:07.7278845Z ok (4.116s) 2022-11-23T03:40:07.7279362Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7280125Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78586 2022-11-23T03:40:07.7280650Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78587 2022-11-23T03:40:07.7281255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7281706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7282258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7282718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7283292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7283783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7284341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7284798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7285248Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:40:07.7285726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7286377Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7287059Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7287575Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7288028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7288375Z dist init r=1, world=2 2022-11-23T03:40:07.7288627Z dist init r=0, world=2 2022-11-23T03:40:07.7288849Z ok (4.116s) 2022-11-23T03:40:07.7289361Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7290111Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78729 2022-11-23T03:40:07.7290630Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78730 2022-11-23T03:40:07.7291217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7291701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7292275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7292747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7293303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7293740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7294314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7294755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7295213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7295702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7296357Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7297023Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7297541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7298006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7298365Z dist init r=1, world=2 2022-11-23T03:40:07.7298599Z dist init r=0, world=2 2022-11-23T03:40:07.7298839Z ok (4.115s) 2022-11-23T03:40:07.7299348Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7300061Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78872 2022-11-23T03:40:07.7300590Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78873 2022-11-23T03:40:07.7301192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7301635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7302186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7302653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7303228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7303656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7304513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7304986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7305446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:40:07.7305928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7306596Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7307445Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7307971Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7308422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7308787Z dist init r=1, world=2 2022-11-23T03:40:07.7309050Z dist init r=0, world=2 2022-11-23T03:40:07.7309278Z ok (4.217s) 2022-11-23T03:40:07.7309808Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7310510Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79015 2022-11-23T03:40:07.7311044Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79016 2022-11-23T03:40:07.7311635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7312091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7312672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7313131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7313728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7314174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7314803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7315257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7315716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7316220Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7316877Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7317609Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7318145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7318631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7318973Z dist init r=1, world=2 2022-11-23T03:40:07.7319236Z dist init r=0, world=2 2022-11-23T03:40:07.7319493Z ok (4.216s) 2022-11-23T03:40:07.7341336Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7342102Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79158 2022-11-23T03:40:07.7342629Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79159 2022-11-23T03:40:07.7343298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7343803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7344669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7345138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7345863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7346306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7346874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7347337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7347774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 0 2022-11-23T03:40:07.7348278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7348937Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7349634Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7350147Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7350619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7350978Z dist init r=1, world=2 2022-11-23T03:40:07.7351213Z dist init r=0, world=2 2022-11-23T03:40:07.7351451Z ok (4.016s) 2022-11-23T03:40:07.7351968Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7352656Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79301 2022-11-23T03:40:07.7353161Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79302 2022-11-23T03:40:07.7353775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7354230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7354785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7355259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7355833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7356361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7356928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7357399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7357855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7358362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7358994Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7359680Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7360202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7360659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7361015Z dist init r=1, world=2 2022-11-23T03:40:07.7361269Z dist init r=0, world=2 2022-11-23T03:40:07.7361519Z ok (4.116s) 2022-11-23T03:40:07.7362026Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7362782Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79444 2022-11-23T03:40:07.7363316Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79445 2022-11-23T03:40:07.7363912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7364365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7364946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7365414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7365974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7366421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7366990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7367452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7367886Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 0 2022-11-23T03:40:07.7368399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7369056Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7369721Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7370241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7370709Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7371068Z dist init r=0, world=2 2022-11-23T03:40:07.7371304Z dist init r=1, world=2 2022-11-23T03:40:07.7371543Z ok (4.116s) 2022-11-23T03:40:07.7372058Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7372771Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79587 2022-11-23T03:40:07.7373303Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79588 2022-11-23T03:40:07.7373912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7374364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7374917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7375391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7375965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7376408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7376956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7377425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7377880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7378364Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7379021Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7379768Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7380288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7380740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7381089Z dist init r=0, world=2 2022-11-23T03:40:07.7381345Z dist init r=1, world=2 2022-11-23T03:40:07.7381567Z ok (4.216s) 2022-11-23T03:40:07.7382090Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7382771Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79730 2022-11-23T03:40:07.7383298Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79731 2022-11-23T03:40:07.7384139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7384612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7385207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7385687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7386256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7386702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7387270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7387721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7388180Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 1 2022-11-23T03:40:07.7388680Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7389331Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7390073Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7390607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7391083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7391440Z dist init r=0, world=2 2022-11-23T03:40:07.7391674Z dist init r=1, world=2 2022-11-23T03:40:07.7391915Z ok (4.117s) 2022-11-23T03:40:07.7392433Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7393102Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79873 2022-11-23T03:40:07.7393623Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79874 2022-11-23T03:40:07.7394237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7394690Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7395247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7395731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7396449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7396894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7397445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7397909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7398362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7398844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7399498Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7400183Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7400719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7401175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7401535Z dist init r=1, world=2 2022-11-23T03:40:07.7401800Z dist init r=0, world=2 2022-11-23T03:40:07.7402031Z ok (4.117s) 2022-11-23T03:40:07.7402554Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7403243Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80016 2022-11-23T03:40:07.7403782Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80017 2022-11-23T03:40:07.7404371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7404836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7405418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7405898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7406458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7406965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7407558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7408005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7408457Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-11-23T03:40:07.7408963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7409614Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7410280Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7410797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7411272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7411636Z dist init r=0, world=2 2022-11-23T03:40:07.7411866Z dist init r=1, world=2 2022-11-23T03:40:07.7412105Z ok (4.217s) 2022-11-23T03:40:07.7412613Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7413340Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80159 2022-11-23T03:40:07.7413864Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80160 2022-11-23T03:40:07.7414477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7414925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7415486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7415953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7416527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7416949Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7417519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7417981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7418435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7418914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7419570Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7420255Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7420774Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7421226Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7421581Z dist init r=0, world=2 2022-11-23T03:40:07.7421834Z dist init r=1, world=2 2022-11-23T03:40:07.7422058Z ok (4.116s) 2022-11-23T03:40:07.7422563Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7423236Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80302 2022-11-23T03:40:07.7424029Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80303 2022-11-23T03:40:07.7424660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7425111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7425687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7426149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7426721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7427164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7427730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7428178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7428631Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-11-23T03:40:07.7429131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7429786Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7430542Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7431060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7431529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7431864Z dist init r=0, world=2 2022-11-23T03:40:07.7432115Z dist init r=1, world=2 2022-11-23T03:40:07.7432355Z ok (4.217s) 2022-11-23T03:40:07.7432844Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7433520Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80445 2022-11-23T03:40:07.7434048Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80446 2022-11-23T03:40:07.7434653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7435084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7435657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7436125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7436703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7437126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7437696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7438160Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7438620Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7439099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7439805Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7440494Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7441054Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7441535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7441882Z dist init r=0, world=2 2022-11-23T03:40:07.7442137Z dist init r=1, world=2 2022-11-23T03:40:07.7442357Z ok (4.217s) 2022-11-23T03:40:07.7442866Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7443588Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80588 2022-11-23T03:40:07.7444100Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80589 2022-11-23T03:40:07.7444718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7445166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7445738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7446186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7446763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7447261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7447830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7448297Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7448756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-11-23T03:40:07.7449262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7449897Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7450585Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7451109Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7451579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7451913Z dist init r=0, world=2 2022-11-23T03:40:07.7452164Z dist init r=1, world=2 2022-11-23T03:40:07.7452404Z ok (4.116s) 2022-11-23T03:40:07.7452888Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7453575Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80731 2022-11-23T03:40:07.7454098Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80732 2022-11-23T03:40:07.7454711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7455145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7455714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7456180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7456755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7457180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7457795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7458264Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7458698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7459194Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7459849Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7460535Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7461037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7461511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7461868Z dist init r=1, world=2 2022-11-23T03:40:07.7462102Z dist init r=0, world=2 2022-11-23T03:40:07.7462344Z ok (4.217s) 2022-11-23T03:40:07.7462856Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7463586Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80874 2022-11-23T03:40:07.7464379Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80875 2022-11-23T03:40:07.7464997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7465447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7466024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7466473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7467050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7467493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7468049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7468514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7468965Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-11-23T03:40:07.7469465Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7470101Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7470787Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7471309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7471779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7472119Z dist init r=0, world=2 2022-11-23T03:40:07.7472369Z dist init r=1, world=2 2022-11-23T03:40:07.7472613Z ok (4.217s) 2022-11-23T03:40:07.7473098Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7473772Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81017 2022-11-23T03:40:07.7474363Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81018 2022-11-23T03:40:07.7474986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7475417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7475987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7476460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7477016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7477461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7478029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7478497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7478929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7479425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7480080Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7480831Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7481332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7481803Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7482159Z dist init r=0, world=2 2022-11-23T03:40:07.7482394Z dist init r=1, world=2 2022-11-23T03:40:07.7482639Z ok (4.116s) 2022-11-23T03:40:07.7483149Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7483827Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81160 2022-11-23T03:40:07.7484334Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81161 2022-11-23T03:40:07.7484950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7485400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7485955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7486429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7487008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7487450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7488006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7488468Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7488922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-11-23T03:40:07.7489422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7490056Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7490743Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7491309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7491768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7492129Z dist init r=0, world=2 2022-11-23T03:40:07.7492381Z dist init r=1, world=2 2022-11-23T03:40:07.7492621Z ok (4.117s) 2022-11-23T03:40:07.7493107Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7493787Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81303 2022-11-23T03:40:07.7494309Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81304 2022-11-23T03:40:07.7494922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7495354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7495922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7496389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7496944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7497445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7498019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7498479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7498911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7499416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7500069Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7500738Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7501261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7501739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7502096Z dist init r=0, world=2 2022-11-23T03:40:07.7502330Z dist init r=1, world=2 2022-11-23T03:40:07.7502568Z ok (4.116s) 2022-11-23T03:40:07.7503077Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7503740Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81446 2022-11-23T03:40:07.7504464Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81447 2022-11-23T03:40:07.7505085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7505541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7506097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7506569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7507144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7507589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7508208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7508688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7509139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-11-23T03:40:07.7509617Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7510282Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7510969Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7511490Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7511941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7512294Z dist init r=1, world=2 2022-11-23T03:40:07.7512549Z dist init r=0, world=2 2022-11-23T03:40:07.7512774Z ok (4.216s) 2022-11-23T03:40:07.7513276Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7513952Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81589 2022-11-23T03:40:07.7514549Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81590 2022-11-23T03:40:07.7515145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7515594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7516167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7516635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7517190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7517634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7518199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7518647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7519102Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7519601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7520256Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7520927Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7521448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7521918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7522279Z dist init r=0, world=2 2022-11-23T03:40:07.7522512Z dist init r=1, world=2 2022-11-23T03:40:07.7522751Z ok (4.116s) 2022-11-23T03:40:07.7523257Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7523922Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81732 2022-11-23T03:40:07.7524496Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81733 2022-11-23T03:40:07.7525116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7525567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7526120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7526593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7527168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7527612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7528160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7528624Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7529078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-11-23T03:40:07.7529560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7530214Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7530967Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7531490Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7531943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7532291Z dist init r=0, world=2 2022-11-23T03:40:07.7532542Z dist init r=1, world=2 2022-11-23T03:40:07.7532768Z ok (4.217s) 2022-11-23T03:40:07.7533274Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7533953Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81875 2022-11-23T03:40:07.7534479Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81876 2022-11-23T03:40:07.7535070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7535517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7536090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7536554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7537112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7537553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7538122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7538566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7539030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7539519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7540209Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7540986Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7541561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7542032Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7542366Z dist init r=0, world=2 2022-11-23T03:40:07.7542613Z dist init r=1, world=2 2022-11-23T03:40:07.7542850Z ok (4.216s) 2022-11-23T03:40:07.7543333Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7544303Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82018 2022-11-23T03:40:07.7544831Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82019 2022-11-23T03:40:07.7545443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7545873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7546442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7546901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7547456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7547975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7548538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7548992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7549421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 
1 2022-11-23T03:40:07.7549919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7550565Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7551244Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7551740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7552209Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7552558Z dist init r=1, world=2 2022-11-23T03:40:07.7552788Z dist init r=0, world=2 2022-11-23T03:40:07.7553023Z ok (4.216s) 2022-11-23T03:40:07.7553523Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7554195Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82161 2022-11-23T03:40:07.7554698Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82162 2022-11-23T03:40:07.7555301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7555743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7556298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7556760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7557331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7557767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7558378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7558891Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7559336Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7559812Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7560467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7561140Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7561657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7562103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7562452Z dist init r=0, world=2 2022-11-23T03:40:07.7562703Z dist init r=1, world=2 2022-11-23T03:40:07.7562923Z ok (4.317s) 2022-11-23T03:40:07.7563391Z test_fsdp_state_dict_keys_state_dict_type_local_state_dict (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82304 2022-11-23T03:40:07.7563946Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82305 2022-11-23T03:40:07.7564600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7565025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7565592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7566053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7566624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7567046Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7567611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7568066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7568496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7568993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7569640Z INFO:torch.distributed.distributed_c10d:Rank 1: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7570320Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7570821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7571284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7572600Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7573529Z warnings.warn( 2022-11-23T03:40:07.7574737Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7575512Z warnings.warn( 2022-11-23T03:40:07.7575742Z dist init r=1, world=2 2022-11-23T03:40:07.7575986Z dist init r=0, world=2 2022-11-23T03:40:07.7576222Z ok (4.015s) 2022-11-23T03:40:07.7576673Z test_fsdp_state_dict_keys_state_dict_type_sharded_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82447 2022-11-23T03:40:07.7577238Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82448 2022-11-23T03:40:07.7577842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7578283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7578832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7579301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7579871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7580294Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7580858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7581372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7581821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7582299Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7582946Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7583627Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7584343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7584796Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7586051Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7586828Z warnings.warn( 2022-11-23T03:40:07.7587970Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7588731Z warnings.warn( 2022-11-23T03:40:07.7588963Z dist init r=0, world=2 2022-11-23T03:40:07.7589208Z dist init r=1, world=2 2022-11-23T03:40:07.7589444Z ok (4.015s) 2022-11-23T03:40:07.7589888Z test_fsdp_state_dict_keys_state_dict_type_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82590 2022-11-23T03:40:07.7590438Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82591 2022-11-23T03:40:07.7591042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7591560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7592124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7592587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7593152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7593580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7594142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7594599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7595046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7595529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7596175Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7596569Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7596796Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7597086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7598108Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7598220Z warnings.warn( 2022-11-23T03:40:07.7599206Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7599318Z warnings.warn( 2022-11-23T03:40:07.7599411Z dist init r=1, world=2 2022-11-23T03:40:07.7599518Z dist init r=0, world=2 2022-11-23T03:40:07.7599618Z ok (4.114s) 2022-11-23T03:40:07.7599957Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_after_wrap_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7600403Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82733 2022-11-23T03:40:07.7600620Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82734 2022-11-23T03:40:07.7600986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7601168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7601527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7601716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7602075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7602247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7602679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7602874Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7603117Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7603357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7603759Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7604132Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7604358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7604587Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7605208Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7605319Z warnings.warn( 2022-11-23T03:40:07.7605930Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7606088Z warnings.warn( 2022-11-23T03:40:07.7606197Z dist init r=0, world=2 2022-11-23T03:40:07.7606304Z dist init r=1, world=2 2022-11-23T03:40:07.7606385Z ok (4.215s) 2022-11-23T03:40:07.7606724Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_after_wrap_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7607171Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82876 2022-11-23T03:40:07.7607388Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82877 2022-11-23T03:40:07.7607754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7607929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7608304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7608491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7608834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7609005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7609381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7609568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7609810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7610049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7610444Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7610832Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7611058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7611266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7611931Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7612049Z warnings.warn( 2022-11-23T03:40:07.7612662Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7612782Z warnings.warn( 2022-11-23T03:40:07.7612885Z dist init r=0, world=2 2022-11-23T03:40:07.7612991Z dist init r=1, world=2 2022-11-23T03:40:07.7613088Z ok (4.117s) 2022-11-23T03:40:07.7613412Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7613843Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83019 2022-11-23T03:40:07.7614059Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83020 2022-11-23T03:40:07.7614420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7614594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7615024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7615213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7615576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7615747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7616116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7616288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7616531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7616770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7617166Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7617556Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7617781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7618004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7618616Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7618726Z warnings.warn( 2022-11-23T03:40:07.7619317Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7619431Z warnings.warn( 2022-11-23T03:40:07.7619539Z dist init r=1, world=2 2022-11-23T03:40:07.7619643Z dist init r=0, world=2 2022-11-23T03:40:07.7619740Z ok (4.116s) 2022-11-23T03:40:07.7620062Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7620554Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83162 2022-11-23T03:40:07.7620776Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83163 2022-11-23T03:40:07.7621124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7621298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7621674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7621862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7622225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7622395Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7622763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7622950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7623191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7623414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7624049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7624461Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7624742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7624970Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7625594Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7625708Z warnings.warn( 2022-11-23T03:40:07.7626314Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7626426Z warnings.warn( 2022-11-23T03:40:07.7626518Z dist init r=1, world=2 2022-11-23T03:40:07.7626625Z dist init r=0, world=2 2022-11-23T03:40:07.7626722Z ok (4.117s) 2022-11-23T03:40:07.7627049Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_dest_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7627494Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83305 2022-11-23T03:40:07.7627712Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83306 2022-11-23T03:40:07.7628077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7628251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7628612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7628806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7629166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7629336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7629774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7629973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7630218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7630461Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7630859Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7631239Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7631468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7631694Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7632319Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7632431Z warnings.warn( 2022-11-23T03:40:07.7633042Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7633216Z warnings.warn( 2022-11-23T03:40:07.7633326Z dist init r=0, world=2 2022-11-23T03:40:07.7633431Z dist init r=1, world=2 2022-11-23T03:40:07.7633512Z ok (4.116s) 2022-11-23T03:40:07.7633835Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_dest_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7634284Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83448 2022-11-23T03:40:07.7634501Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83449 2022-11-23T03:40:07.7634863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7635036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7635410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7635603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7635948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7636121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7636486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7636676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7636920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7637161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7637555Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7637948Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7638176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7638386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7639045Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7639165Z warnings.warn( 2022-11-23T03:40:07.7639834Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7639948Z warnings.warn( 2022-11-23T03:40:07.7640059Z dist init r=1, world=2 2022-11-23T03:40:07.7640166Z dist init r=0, world=2 2022-11-23T03:40:07.7640265Z ok (4.216s) 2022-11-23T03:40:07.7640610Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7641039Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83591 2022-11-23T03:40:07.7641259Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83592 2022-11-23T03:40:07.7641626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7641800Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7642174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7642449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7642810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7642980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7643351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7643545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7643814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7644057Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7644452Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7644844Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7645069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7645293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7645906Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7646019Z warnings.warn( 2022-11-23T03:40:07.7646601Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7646714Z warnings.warn( 2022-11-23T03:40:07.7646823Z dist init r=0, world=2 2022-11-23T03:40:07.7646931Z dist init r=1, world=2 2022-11-23T03:40:07.7647028Z ok (4.117s) 2022-11-23T03:40:07.7647368Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7647809Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83734 2022-11-23T03:40:07.7648072Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83735 2022-11-23T03:40:07.7648429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7648603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7648978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7649174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7649531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7649707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7650071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7650261Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7650504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7650730Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7651123Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7651569Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7651795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7652017Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7652629Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7652740Z warnings.warn( 2022-11-23T03:40:07.7653353Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7653465Z warnings.warn( 2022-11-23T03:40:07.7653558Z dist init r=0, world=2 2022-11-23T03:40:07.7653664Z dist init r=1, world=2 2022-11-23T03:40:07.7653761Z ok (4.116s) 2022-11-23T03:40:07.7654088Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7654529Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83877 2022-11-23T03:40:07.7654749Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83878 2022-11-23T03:40:07.7655113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7655285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7655639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7655834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7656191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7656362Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7656730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7656915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7657208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7657455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7657849Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7658225Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7658512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7658736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7659348Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7659459Z warnings.warn( 2022-11-23T03:40:07.7660067Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7660173Z warnings.warn( 2022-11-23T03:40:07.7660333Z dist init r=1, world=2 2022-11-23T03:40:07.7660440Z dist init r=0, world=2 2022-11-23T03:40:07.7660521Z ok (4.116s) 2022-11-23T03:40:07.7660847Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7661294Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84020 2022-11-23T03:40:07.7661514Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84021 2022-11-23T03:40:07.7661882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7662054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7662428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7662617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7662978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7663131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7663503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7663690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7664205Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7664460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7664860Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7665254Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7665482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7665689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7666398Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7666518Z warnings.warn( 2022-11-23T03:40:07.7667133Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7667239Z warnings.warn( 2022-11-23T03:40:07.7667351Z dist init r=0, world=2 2022-11-23T03:40:07.7667458Z dist init r=1, world=2 2022-11-23T03:40:07.7667556Z ok (4.219s) 2022-11-23T03:40:07.7667887Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_after_wrap_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7668311Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84163 2022-11-23T03:40:07.7668528Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84164 2022-11-23T03:40:07.7668896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7669071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7669443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7669697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7670059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7670231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7670600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7670769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7671013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7671255Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7671648Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7672041Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7672271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7672497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7672608Z dist init r=0, world=2 2022-11-23T03:40:07.7672697Z dist init r=1, world=2 2022-11-23T03:40:07.7672795Z ok (4.116s) 2022-11-23T03:40:07.7673126Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_after_wrap_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7673570Z Tests saving the state dict, zeroing a target model's parameters, and ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84306 2022-11-23T03:40:07.7673787Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84307 2022-11-23T03:40:07.7674150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7674326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7674700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7674889Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7675235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7675453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7675832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7676019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7676262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7676503Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7676894Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7677286Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7677513Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7677726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7677836Z dist init r=1, world=2 2022-11-23T03:40:07.7677943Z dist init r=0, world=2 2022-11-23T03:40:07.7678045Z ok (4.017s) 2022-11-23T03:40:07.7678367Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7678864Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84449 2022-11-23T03:40:07.7679081Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84450 2022-11-23T03:40:07.7679445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7679601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7679980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7680167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7680526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7680696Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7681067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7681252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7681491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7681716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7682112Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7682504Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7682731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7682953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7683067Z dist init r=1, world=2 2022-11-23T03:40:07.7683175Z dist init r=0, world=2 2022-11-23T03:40:07.7683274Z ok (4.117s) 2022-11-23T03:40:07.7683573Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7684011Z Tests saving the state dict, zeroing a target model's parameters, and ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84592 2022-11-23T03:40:07.7684274Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84593 2022-11-23T03:40:07.7684649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7684819Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7685190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7685386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7685749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7685925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7686276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7686465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7686712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7686957Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7687353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7687813Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7688042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7688272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7688388Z dist init r=0, world=2 2022-11-23T03:40:07.7688480Z dist init r=1, world=2 2022-11-23T03:40:07.7688583Z ok (4.116s) 2022-11-23T03:40:07.7688905Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_dest_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7689346Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84735 2022-11-23T03:40:07.7689560Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84736 2022-11-23T03:40:07.7689928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7690102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7690474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7690649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7691017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7691190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7691561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7691751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7691992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7692237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7692630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7693018Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7693277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7693510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7693620Z dist init r=1, world=2 2022-11-23T03:40:07.7693727Z dist init r=0, world=2 2022-11-23T03:40:07.7693827Z ok (4.116s) 2022-11-23T03:40:07.7694145Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_dest_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7694591Z Tests saving the state dict, zeroing a target model's parameters, and ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84878 2022-11-23T03:40:07.7694808Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84879 2022-11-23T03:40:07.7695154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7695332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7695708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7695896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7696257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7696428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7696846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7697033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7697273Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7697495Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7697892Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7698281Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7698508Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7698735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7698847Z dist init r=1, world=2 2022-11-23T03:40:07.7698953Z dist init r=0, world=2 2022-11-23T03:40:07.7699050Z ok (4.116s) 2022-11-23T03:40:07.7699370Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7699812Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85021 2022-11-23T03:40:07.7700031Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85022 2022-11-23T03:40:07.7700396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7700569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7700942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7701133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7701493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7701664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7702014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7702246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7702493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7702732Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7703123Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7703515Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7703741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7704197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7704316Z dist init r=0, world=2 2022-11-23T03:40:07.7704406Z dist init r=1, world=2 2022-11-23T03:40:07.7704505Z ok (4.016s) 2022-11-23T03:40:07.7704839Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7705284Z Tests saving the state dict, zeroing a target model's parameters, and ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85164 2022-11-23T03:40:07.7705505Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85165 2022-11-23T03:40:07.7705949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7706120Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7706497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7706667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7707031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7707205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7707572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7707757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7708001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7708240Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7708633Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7709022Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7709238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7709464Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7709577Z dist init r=0, world=2 2022-11-23T03:40:07.7709685Z dist init r=1, world=2 2022-11-23T03:40:07.7709786Z ok (4.116s) 2022-11-23T03:40:07.7710107Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_source_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7710554Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85307 2022-11-23T03:40:07.7710772Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85308 2022-11-23T03:40:07.7711123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7711357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7711744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7711933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7712293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7712468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7712835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7713021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7713246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7713491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7713885Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7714272Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7715940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7716256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7716369Z dist init r=0, world=2 2022-11-23T03:40:07.7716477Z dist init r=1, world=2 2022-11-23T03:40:07.7716575Z ok (4.117s) 2022-11-23T03:40:07.7716882Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_source_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7717355Z Tests saving the state dict, zeroing a target model's parameters, and ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85450 2022-11-23T03:40:07.7717573Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85451 2022-11-23T03:40:07.7717940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7718113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7718490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7718678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7719038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7719192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7719562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7719747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7719988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7720229Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7720621Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7721017Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7721245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7721470Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7721562Z dist init r=0, world=2 2022-11-23T03:40:07.7721669Z dist init r=1, world=2 2022-11-23T03:40:07.7721823Z ok (4.016s) 2022-11-23T03:40:07.7722151Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7722465Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85593 2022-11-23T03:40:07.7722685Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85594 2022-11-23T03:40:07.7723056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7723237Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7723696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7723883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7724247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7724473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7724847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7725083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7725323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7725563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7725960Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7726338Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7726565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7726787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7726897Z dist init r=0, world=2 2022-11-23T03:40:07.7727004Z dist init r=1, world=2 2022-11-23T03:40:07.7727101Z ok (4.618s) 2022-11-23T03:40:07.7727427Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7727740Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85744 2022-11-23T03:40:07.7727938Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85745 2022-11-23T03:40:07.7728308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7728483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7728858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7729044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7729406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7729581Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7729951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7730137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7730361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7730651Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7731054Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7731443Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7731672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7731896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7732005Z dist init r=0, world=2 2022-11-23T03:40:07.7732113Z dist init r=1, world=2 2022-11-23T03:40:07.7732195Z ok (4.016s) 2022-11-23T03:40:07.7732518Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7732836Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85887 2022-11-23T03:40:07.7733050Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85888 2022-11-23T03:40:07.7733415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7733638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7734013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7734202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7734565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7734720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7735091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7735276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7735516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7735759Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7736154Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7736542Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7736768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7736992Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7737090Z dist init r=1, world=2 2022-11-23T03:40:07.7737198Z dist init r=0, world=2 2022-11-23T03:40:07.7737296Z ok (4.719s) 2022-11-23T03:40:07.7737618Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7737930Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86038 2022-11-23T03:40:07.7738150Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86039 2022-11-23T03:40:07.7738516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7738693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7739047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7739287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7739713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7739890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7740260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7740455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7740701Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7740946Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7741344Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7741720Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7741950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7742180Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7742294Z dist init r=0, world=2 2022-11-23T03:40:07.7742459Z dist init r=1, world=2 2022-11-23T03:40:07.7742560Z ok (4.117s) 2022-11-23T03:40:07.7742894Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7743212Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86181 2022-11-23T03:40:07.7743409Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86182 2022-11-23T03:40:07.7743828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7744378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7744772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7744964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7745335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7745516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7745895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7746084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7746313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7746553Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7746941Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7747329Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7747555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7747779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7747888Z dist init r=1, world=2 2022-11-23T03:40:07.7747994Z dist init r=0, world=2 2022-11-23T03:40:07.7748075Z ok (4.618s) 2022-11-23T03:40:07.7748490Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7748820Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86332 2022-11-23T03:40:07.7749041Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86333 2022-11-23T03:40:07.7749419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7749601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7749979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7750174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7750542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7750700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7751071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7751261Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7751505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7751834Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7752235Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7752627Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7752856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7753092Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7753185Z dist init r=0, world=2 2022-11-23T03:40:07.7753297Z dist init r=1, world=2 2022-11-23T03:40:07.7753398Z ok (4.017s) 2022-11-23T03:40:07.7753732Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7754054Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86475 2022-11-23T03:40:07.7754273Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86476 2022-11-23T03:40:07.7754645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7754824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7755185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7755378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7755745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7755920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7756300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7756493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7756736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7756982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7757408Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7757816Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7758047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7758278Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7758395Z dist init r=1, world=2 2022-11-23T03:40:07.7758505Z dist init r=0, world=2 2022-11-23T03:40:07.7758607Z ok (4.618s) 2022-11-23T03:40:07.7758934Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7759247Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86626 2022-11-23T03:40:07.7759448Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86627 2022-11-23T03:40:07.7759820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7759997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7760376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7760619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7760987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7761162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7761532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7761699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7761946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7762188Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7762582Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7762978Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7763205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7763435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7763548Z dist init r=1, world=2 2022-11-23T03:40:07.7763660Z dist init r=0, world=2 2022-11-23T03:40:07.7763742Z ok (4.016s) 2022-11-23T03:40:07.7764066Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7764384Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86769 2022-11-23T03:40:07.7764603Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86770 2022-11-23T03:40:07.7764979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7765154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7765533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7765729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7766074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7766295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7766683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7766873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7767118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7767372Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7767771Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7768167Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7768399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7768608Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7768725Z dist init r=1, world=2 2022-11-23T03:40:07.7768835Z dist init r=0, world=2 2022-11-23T03:40:07.7768936Z ok (4.618s) 2022-11-23T03:40:07.7769256Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7769629Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86920 2022-11-23T03:40:07.7769851Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86921 2022-11-23T03:40:07.7770224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7770380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7770762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7770954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7771321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7771499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7771874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7772064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7772308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7772552Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7772932Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7773329Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7773558Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7773782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7773902Z dist init r=0, world=2 2022-11-23T03:40:07.7774013Z dist init r=1, world=2 2022-11-23T03:40:07.7774117Z ok (4.518s) 2022-11-23T03:40:07.7774431Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7774729Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87071 2022-11-23T03:40:07.7774999Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87072 2022-11-23T03:40:07.7775384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7775563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7775936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7776131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7776493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7776670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7777043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7777214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7777461Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7777706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7778102Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7778547Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7778769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7778994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7779103Z dist init r=0, world=2 2022-11-23T03:40:07.7779193Z dist init r=1, world=2 2022-11-23T03:40:07.7779291Z ok (4.618s) 2022-11-23T03:40:07.7779609Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7779922Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87222 2022-11-23T03:40:07.7780135Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87223 2022-11-23T03:40:07.7780584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7780756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7781134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7781322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7781667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7781838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7782214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7782394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7782639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7782876Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7783266Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7783653Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7784216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7784446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7784558Z dist init r=0, world=2 2022-11-23T03:40:07.7784667Z dist init r=1, world=2 2022-11-23T03:40:07.7784768Z ok (4.618s) 2022-11-23T03:40:07.7785082Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_False_fsdp_root_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7785521Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87373 2022-11-23T03:40:07.7785738Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87374 2022-11-23T03:40:07.7786100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7786258Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7786636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7786826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7787186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7787422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7787791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7787975Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7788215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7788439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7788835Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7789225Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7789451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7789677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7790294Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7790405Z warnings.warn( 2022-11-23T03:40:07.7791017Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7791132Z warnings.warn( 2022-11-23T03:40:07.7791237Z dist init r=0, world=2 2022-11-23T03:40:07.7791329Z dist init r=1, world=2 2022-11-23T03:40:07.7791428Z ok (4.718s) 2022-11-23T03:40:07.7791737Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_False_fsdp_root_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7792165Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87524 2022-11-23T03:40:07.7792384Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87525 2022-11-23T03:40:07.7792746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7792920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7793340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7793517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7793876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7794047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7794417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7794605Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7794928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7795170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7795566Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7795955Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7796167Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7796386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7797462Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7797575Z warnings.warn( 2022-11-23T03:40:07.7798572Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7798685Z warnings.warn( 2022-11-23T03:40:07.7799295Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7799404Z warnings.warn( 2022-11-23T03:40:07.7800016Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:40:07.7800125Z warnings.warn( 2022-11-23T03:40:07.7800234Z dist init r=1, world=2 2022-11-23T03:40:07.7800324Z dist init r=0, world=2 2022-11-23T03:40:07.7800422Z ok (4.618s) 2022-11-23T03:40:07.7800737Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_True_fsdp_root_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7801168Z Tests that FSDP's state_dict can be loaded into a local model. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87675 2022-11-23T03:40:07.7801385Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87676 2022-11-23T03:40:07.7801749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7801924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7802340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7802518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7802881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7803054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7803426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7803612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7803853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7804094Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7804490Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for 
key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7804884Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7805094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7805316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7805476Z dist init r=0, world=2 2022-11-23T03:40:07.7805585Z dist init r=1, world=2 2022-11-23T03:40:07.7805685Z ok (4.017s) 2022-11-23T03:40:07.7805992Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_True_fsdp_root_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7806417Z Tests that FSDP's state_dict can be loaded into a local model. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87818 2022-11-23T03:40:07.7806635Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87819 2022-11-23T03:40:07.7806980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7807153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7807525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7807715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7808071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7808242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7808606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7808793Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7809034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7809256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7809649Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7810043Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7810269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7810520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7810631Z dist init r=0, world=2 2022-11-23T03:40:07.7810737Z dist init r=1, world=2 2022-11-23T03:40:07.7810834Z ok (4.118s) 2022-11-23T03:40:07.7811168Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_False_fsdp_root_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7811601Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87961 2022-11-23T03:40:07.7811815Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87962 2022-11-23T03:40:07.7812179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7812352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7812725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7812914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7813273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7813444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7813795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7813984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7814223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7814529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7814922Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7815314Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7815547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7815774Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7815885Z dist init r=1, world=2 2022-11-23T03:40:07.7815977Z dist init r=0, world=2 2022-11-23T03:40:07.7816074Z ok (4.718s) 2022-11-23T03:40:07.7816371Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_False_fsdp_root_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7816794Z Tests that FSDP's state_dict can be loaded into a local model. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88112 2022-11-23T03:40:07.7817011Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88113 2022-11-23T03:40:07.7817379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7817556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7817919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7818078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7818458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7818646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7819018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7819206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7819445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7819685Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7820121Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7820518Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7820729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7820951Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7821968Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7822079Z warnings.warn( 2022-11-23T03:40:07.7823072Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:40:07.7823223Z warnings.warn( 2022-11-23T03:40:07.7823332Z dist init r=0, world=2 2022-11-23T03:40:07.7823440Z dist init r=1, world=2 2022-11-23T03:40:07.7823538Z ok (4.618s) 2022-11-23T03:40:07.7823835Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_True_fsdp_root_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7824518Z Tests that FSDP's state_dict can be loaded into a local model. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88263 2022-11-23T03:40:07.7824739Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88264 2022-11-23T03:40:07.7825108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7825287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7825659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7825851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7826213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7826385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7826754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7826926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7827168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7827413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7827804Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for 
key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7828197Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7828423Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7828647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7828757Z dist init r=1, world=2 2022-11-23T03:40:07.7828847Z dist init r=0, world=2 2022-11-23T03:40:07.7828944Z ok (4.519s) 2022-11-23T03:40:07.7829311Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_True_fsdp_root_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7829750Z Tests that FSDP's state_dict can be loaded into a local model. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88414 2022-11-23T03:40:07.7829964Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88415 2022-11-23T03:40:07.7830332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7830504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7830877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7831068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7831413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7831586Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7831950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7832135Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-11-23T03:40:07.7832438Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7832673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7833065Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7833458Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7833669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7833899Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7834903Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7835019Z warnings.warn( 2022-11-23T03:40:07.7836009Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:40:07.7836117Z warnings.warn( 2022-11-23T03:40:07.7836227Z dist init r=0, world=2 2022-11-23T03:40:07.7836334Z dist init r=1, world=2 2022-11-23T03:40:07.7836433Z ok (4.619s) 2022-11-23T03:40:07.7836670Z test_state_dict_rank0_offload_save_load_flow_use_orig_params_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7836974Z Tests saving a model checkpoint only on rank 0 and loading it only ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88565 2022-11-23T03:40:07.7837173Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88566 2022-11-23T03:40:07.7837543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7837763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7838146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7838335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7838695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7838871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7839238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7839406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7839650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7839948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7840347Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7840835Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7841066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7841368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7841479Z dist init r=1, world=2 2022-11-23T03:40:07.7841586Z dist init r=0, world=2 2022-11-23T03:40:07.7841666Z ok (4.317s) 2022-11-23T03:40:07.7841901Z test_state_dict_rank0_offload_save_load_flow_use_orig_params_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7842205Z Tests saving a model checkpoint only on rank 0 and loading it only ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88708 2022-11-23T03:40:07.7842422Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88709 2022-11-23T03:40:07.7842794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7842966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7843337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7843529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7843931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7844109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7844484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7844676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7844917Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7845156Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7845546Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7845944Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7846171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7846378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7846489Z dist init r=0, world=2 2022-11-23T03:40:07.7846597Z dist init r=1, world=2 2022-11-23T03:40:07.7846695Z ok (4.317s) 2022-11-23T03:40:07.7847089Z test_state_dict_save_load_flow_state_dict_type_local_state_dict (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88851 2022-11-23T03:40:07.7847311Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88852 2022-11-23T03:40:07.7847672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7847850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7848205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7848396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7848750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7848921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7849288Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7849473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7849715Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7849956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7850409Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7850780Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7851006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7851230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7852246Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7852359Z warnings.warn( 2022-11-23T03:40:07.7853346Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7853455Z warnings.warn( 2022-11-23T03:40:07.7853690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7853921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7854150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7854382Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7854475Z dist init r=0, world=2 2022-11-23T03:40:07.7854581Z dist init r=1, world=2 2022-11-23T03:40:07.7854681Z ok (4.717s) 2022-11-23T03:40:07.7855030Z test_state_dict_save_load_flow_state_dict_type_sharded_state_dict (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89002 2022-11-23T03:40:07.7855246Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89003 2022-11-23T03:40:07.7855658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7855838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7856216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7856389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7856753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7856925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7857295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T03:40:07.7857481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7857724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7857964Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7858362Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7858736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7859016Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7859241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7860251Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7860364Z warnings.warn( 2022-11-23T03:40:07.7861351Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:40:07.7861462Z warnings.warn( 2022-11-23T03:40:07.7861694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7861928Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7862163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7862392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7862484Z dist init r=1, world=2 2022-11-23T03:40:07.7862592Z dist init r=0, world=2 2022-11-23T03:40:07.7862689Z ok (4.617s) 2022-11-23T03:40:07.7863024Z test_state_dict_save_load_flow_state_dict_type_state_dict (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89153 2022-11-23T03:40:07.7863243Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89154 2022-11-23T03:40:07.7863616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7863788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7864445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7864628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7864993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7865165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7865537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7865729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7865976Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7866216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7866614Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7867010Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7867220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7867446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7868533Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7868647Z warnings.warn( 2022-11-23T03:40:07.7869643Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7869753Z warnings.warn( 2022-11-23T03:40:07.7869985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:40:07.7870211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7870441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7870673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:40:07.7870784Z dist init r=0, world=2 2022-11-23T03:40:07.7870876Z dist init r=1, world=2 2022-11-23T03:40:07.7870981Z ok (4.619s) 2022-11-23T03:40:07.7871348Z test_state_dict_skip_module_state_dict_type_local_state_dict_double_nest_True (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89304 2022-11-23T03:40:07.7871565Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89305 2022-11-23T03:40:07.7871933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7872192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7872576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7872766Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7873106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7873336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7873719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7873906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7874148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7874393Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7874787Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7875177Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7875408Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7875620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7875730Z dist init r=1, world=2 2022-11-23T03:40:07.7875837Z dist init r=0, world=2 2022-11-23T03:40:07.7875935Z ok (4.719s) 2022-11-23T03:40:07.7876300Z test_state_dict_skip_module_state_dict_type_sharded_state_dict_double_nest_True (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89455 2022-11-23T03:40:07.7876579Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89456 2022-11-23T03:40:07.7876951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7877125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7877482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7877674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7878033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7878204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7878571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7878759Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7879001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7879241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7879635Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7880012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7880240Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7880468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7881086Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7881205Z warnings.warn( 2022-11-23T03:40:07.7881815Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7881923Z warnings.warn( 2022-11-23T03:40:07.7882034Z dist init r=0, world=2 2022-11-23T03:40:07.7882190Z dist init r=1, world=2 2022-11-23T03:40:07.7882277Z ok (4.720s) 2022-11-23T03:40:07.7882634Z test_state_dict_skip_module_state_dict_type_state_dict_double_nest_True (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89606 2022-11-23T03:40:07.7882850Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89607 2022-11-23T03:40:07.7883220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7883392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7883765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7883953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7884312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7884469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7884840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7885026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7885265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7885556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7885950Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7886337Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7886562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7886791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7886885Z dist init r=1, world=2 2022-11-23T03:40:07.7886992Z dist init r=0, world=2 2022-11-23T03:40:07.7887090Z ok (4.618s) 2022-11-23T03:40:07.7887382Z test_state_dict_type (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89757 2022-11-23T03:40:07.7887600Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89758 2022-11-23T03:40:07.7887966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7888137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7888509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7888681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7889042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7889215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7889588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7889776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7890016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7890256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7890647Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for 
key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7891093Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7891312Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7891539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7891650Z dist init r=1, world=2 2022-11-23T03:40:07.7891757Z dist init r=0, world=2 2022-11-23T03:40:07.7891859Z ok (4.116s) 2022-11-23T03:40:07.7892255Z test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_False (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89900 2022-11-23T03:40:07.7892471Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89901 2022-11-23T03:40:07.7892842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7892999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7893376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7893567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7893926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7894154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7894526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7894711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7894953Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7895174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7895571Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7895965Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7896193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7896418Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7897424Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7897539Z warnings.warn( 2022-11-23T03:40:07.7898533Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7898644Z warnings.warn( 2022-11-23T03:40:07.7899257Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. 
Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7899367Z warnings.warn( 2022-11-23T03:40:07.7900023Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7900118Z warnings.warn( 2022-11-23T03:40:07.7900227Z dist init r=1, world=2 2022-11-23T03:40:07.7900335Z dist init r=0, world=2 2022-11-23T03:40:07.7900435Z ok (4.117s) 2022-11-23T03:40:07.7900828Z test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_True (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90043 2022-11-23T03:40:07.7901047Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90044 2022-11-23T03:40:07.7901414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7901587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7901946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7902135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7902494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7902667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7903034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7903268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7903510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 0 2022-11-23T03:40:07.7903751Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7904444Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7904826Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7905054Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7905278Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7905995Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:386: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True) 2022-11-23T03:40:07.7906107Z warnings.warn( 2022-11-23T03:40:07.7906802Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:386: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True) 2022-11-23T03:40:07.7906912Z warnings.warn( 2022-11-23T03:40:07.7907022Z dist init r=1, world=2 2022-11-23T03:40:07.7907128Z dist init r=0, world=2 2022-11-23T03:40:07.7907209Z ok (4.017s) 2022-11-23T03:40:07.7907604Z test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_False (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90186 2022-11-23T03:40:07.7907825Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90187 2022-11-23T03:40:07.7908193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7908366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7908738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7909001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7909435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7909607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7909962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7910156Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7910401Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7910642Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7911033Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7911427Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7911771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7911997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7913008Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7913181Z warnings.warn( 2022-11-23T03:40:07.7914198Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7914291Z warnings.warn( 2022-11-23T03:40:07.7914905Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:40:07.7915013Z warnings.warn( 2022-11-23T03:40:07.7915624Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:40:07.7915731Z warnings.warn( 2022-11-23T03:40:07.7915842Z dist init r=1, world=2 2022-11-23T03:40:07.7915950Z dist init r=0, world=2 2022-11-23T03:40:07.7916047Z ok (4.017s) 2022-11-23T03:40:07.7916440Z test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_True (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90329 2022-11-23T03:40:07.7916646Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90330 2022-11-23T03:40:07.7917014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7917185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7917562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7917750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7918153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7918334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7918707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7918874Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7919122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7919363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7919756Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7920149Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7920380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7920604Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7921311Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:386: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True) 2022-11-23T03:40:07.7921514Z warnings.warn( 2022-11-23T03:40:07.7922224Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:386: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True) 2022-11-23T03:40:07.7922333Z warnings.warn( 2022-11-23T03:40:07.7922425Z dist init r=0, world=2 2022-11-23T03:40:07.7922534Z dist init r=1, world=2 2022-11-23T03:40:07.7922632Z ok (4.117s) 2022-11-23T03:40:07.7923019Z test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_False (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90472 2022-11-23T03:40:07.7923235Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90473 2022-11-23T03:40:07.7923606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7923780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7924135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7924325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7924687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7924859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7925224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7925408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7925652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7925893Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7926288Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7926661Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7926931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7927162Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7928177Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7928293Z warnings.warn( 2022-11-23T03:40:07.7929285Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7929391Z warnings.warn( 2022-11-23T03:40:07.7929499Z dist init r=0, world=2 2022-11-23T03:40:07.7929606Z dist init r=1, world=2 2022-11-23T03:40:07.7929703Z ok (4.117s) 2022-11-23T03:40:07.7930088Z test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_True (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90615 2022-11-23T03:40:07.7930338Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90616 2022-11-23T03:40:07.7930711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7930884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7931260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7931449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7931803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7931974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7932343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7932512Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7932753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7932992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7933390Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7933781Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7934008Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7934230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7934939Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:386: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True) 2022-11-23T03:40:07.7935053Z warnings.warn( 2022-11-23T03:40:07.7935802Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:386: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True) 2022-11-23T03:40:07.7935916Z warnings.warn( 2022-11-23T03:40:07.7936012Z dist init r=0, world=2 2022-11-23T03:40:07.7936120Z dist init r=1, world=2 2022-11-23T03:40:07.7936219Z ok (4.117s) 2022-11-23T03:40:07.7936603Z test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_False (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90758 2022-11-23T03:40:07.7936823Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90759 2022-11-23T03:40:07.7937194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7937368Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7937727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7937916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7938275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7938445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7938813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7939068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7939309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7939547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7939997Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7940380Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7940608Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7940831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7941844Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7941955Z warnings.warn( 2022-11-23T03:40:07.7942961Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7943073Z warnings.warn( 2022-11-23T03:40:07.7943182Z dist init r=1, world=2 2022-11-23T03:40:07.7943289Z dist init r=0, world=2 2022-11-23T03:40:07.7943387Z ok (4.117s) 2022-11-23T03:40:07.7943814Z test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_True (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90901 2022-11-23T03:40:07.7944322Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90902 2022-11-23T03:40:07.7944785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7944973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7945351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7945541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7945907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7946080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7946448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7946618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7946862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7947107Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7947502Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7947896Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7948183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7948408Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7949118Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:386: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True) 2022-11-23T03:40:07.7949236Z warnings.warn( 2022-11-23T03:40:07.7949942Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:386: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True) 2022-11-23T03:40:07.7950033Z warnings.warn( 2022-11-23T03:40:07.7950149Z dist init r=1, world=2 2022-11-23T03:40:07.7950257Z dist init r=0, world=2 2022-11-23T03:40:07.7950358Z ok (4.017s) 2022-11-23T03:40:07.7950643Z test_state_dict_with_manual_ac_wrapper_state_dict_type_sharded_state_dict_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7950959Z Tests saving and loading a state dict for a model manually wrapped with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91044 2022-11-23T03:40:07.7951176Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91045 2022-11-23T03:40:07.7951545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7951704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7952079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7952268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7952631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7952803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7953171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7953356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7953652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7953901Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7954278Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7954667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7954898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7955123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7955233Z dist init r=0, world=2 2022-11-23T03:40:07.7955342Z dist init r=1, world=2 2022-11-23T03:40:07.7955440Z ok (4.216s) 2022-11-23T03:40:07.7955726Z test_state_dict_with_manual_ac_wrapper_state_dict_type_sharded_state_dict_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7956030Z Tests saving and loading a state dict for a model manually wrapped with ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91187 2022-11-23T03:40:07.7956248Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91188 2022-11-23T03:40:07.7956618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7956839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7957216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7957405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7957765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7957938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7958312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7958485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7958724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7958968Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7959363Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7959752Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7959978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7960206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7960315Z dist init r=1, world=2 2022-11-23T03:40:07.7960406Z dist init r=0, world=2 2022-11-23T03:40:07.7960504Z ok (4.015s) 2022-11-23T03:40:07.7960777Z test_state_dict_with_manual_ac_wrapper_state_dict_type_state_dict_rank0_only_and_offload_False (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7961088Z Tests saving and loading a state dict for a model manually wrapped with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91330 2022-11-23T03:40:07.7961307Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91331 2022-11-23T03:40:07.7961676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7961850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7962221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7962457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7962810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7962982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7963349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7963538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7963780Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7964017Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7964411Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7964805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7965034Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7965242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7965353Z dist init r=0, world=2 2022-11-23T03:40:07.7965508Z dist init r=1, world=2 2022-11-23T03:40:07.7965608Z ok (4.217s) 2022-11-23T03:40:07.7965880Z test_state_dict_with_manual_ac_wrapper_state_dict_type_state_dict_rank0_only_and_offload_True (__main__.TestFSDPStateDict) 2022-11-23T03:40:07.7966191Z Tests saving and loading a state dict for a model manually wrapped with ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91473 2022-11-23T03:40:07.7966407Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91474 2022-11-23T03:40:07.7966780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7966937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7967307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7967495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7967855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7968025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7968391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7968576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7968822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7969047Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7969438Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7969826Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7970058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7970283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7970393Z dist init r=0, world=2 2022-11-23T03:40:07.7970500Z dist init r=1, world=2 2022-11-23T03:40:07.7970597Z ok (4.217s) 2022-11-23T03:40:07.7970984Z test_state_dict_with_shared_parameters_state_dict_type_local_state_dict (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91616 2022-11-23T03:40:07.7971207Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91617 2022-11-23T03:40:07.7971577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7971749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7972122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7972312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7972673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7972843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7973215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T03:40:07.7973387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7973628Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7973865Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7974258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7974700Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7974925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7975149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7975258Z dist init r=0, world=2 2022-11-23T03:40:07.7975348Z dist init r=1, world=2 2022-11-23T03:40:07.7975449Z ok (4.116s) 2022-11-23T03:40:07.7975811Z test_state_dict_with_shared_parameters_state_dict_type_sharded_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91759 2022-11-23T03:40:07.7976026Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91760 2022-11-23T03:40:07.7976391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7976565Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7976940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7977127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7977479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7977637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7978006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7978193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7978434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7978681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7979073Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7979456Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:40:07.7979683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7979947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7980049Z dist init r=1, world=2 2022-11-23T03:40:07.7980159Z dist init r=0, world=2 2022-11-23T03:40:07.7980256Z ok (4.215s) 2022-11-23T03:40:07.7980609Z test_state_dict_with_shared_parameters_state_dict_type_state_dict (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91902 2022-11-23T03:40:07.7980831Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91903 2022-11-23T03:40:07.7981199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7981371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7981744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7981916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7982276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7982448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7982814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7983042Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7983281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7983518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:40:07.7984161Z INFO:torch.distributed.distributed_c10d:Rank 
0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7984558Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7984786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7985010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7985120Z dist init r=0, world=2 2022-11-23T03:40:07.7985232Z dist init r=1, world=2 2022-11-23T03:40:07.7985325Z ok (4.216s) 2022-11-23T03:40:07.7985630Z test_wrong_state_dict_config (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92045 2022-11-23T03:40:07.7985844Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92046 2022-11-23T03:40:07.7986193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7986366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7986743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7986931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7987292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:40:07.7987463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:40:07.7987834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:40:07.7988018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:40:07.7988262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T03:40:07.7988484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:40:07.7988947Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7989357Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:40:07.7989586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:40:07.7989807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:40:07.7990830Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:40:07.7990941Z warnings.warn( 2022-11-23T03:40:07.7991935Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:40:07.7992103Z warnings.warn( 2022-11-23T03:40:07.7992211Z dist init r=0, world=2 2022-11-23T03:40:07.7992318Z dist init r=1, world=2 2022-11-23T03:40:07.7992399Z ok (4.216s) 2022-11-23T03:40:07.7992424Z 2022-11-23T03:40:07.7992695Z ---------------------------------------------------------------------- 2022-11-23T03:40:07.7992810Z Ran 116 tests in 490.883s 2022-11-23T03:40:07.7992830Z 2022-11-23T03:40:07.7992920Z OK 2022-11-23T03:40:07.7992939Z 2022-11-23T03:40:07.7993060Z Generating XML reports... 2022-11-23T03:40:07.7993509Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict/TEST-TestFSDPStateDict-20221123033156.xml 2022-11-23T03:40:07.7993530Z 2022-11-23T03:40:07.7994117Z ##[endgroup] 2022-11-23T03:40:07.7994579Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_state_dict (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_state_dict_aav7zu6g) 2022-11-23T03:40:07.7994616Z 2022-11-23T03:40:08.1156774Z 2022-11-23T03:40:08.1157162Z real 8m18.968s 2022-11-23T03:40:08.1157312Z user 16m55.488s 2022-11-23T03:40:08.1157427Z sys 14m6.756s 2022-11-23T03:40:08.1157589Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:40:08.1158059Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_summon_full_params.py 2022-11-23T03:40:10.5038531Z Ignoring disabled issues: [] 2022-11-23T03:40:10.5573537Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:40:10.5574129Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:40:10.5574462Z Selected tests: 2022-11-23T03:40:10.5574769Z distributed/fsdp/test_fsdp_summon_full_params.py 2022-11-23T03:40:10.5602745Z Prioritized test from test file changes. 
2022-11-23T03:40:10.5603069Z reordering tests for PR: 2022-11-23T03:40:10.5603347Z prioritized: [] 2022-11-23T03:40:10.5603859Z the rest: ['distributed/fsdp/test_fsdp_summon_full_params.py'] 2022-11-23T03:40:10.5604104Z 2022-11-23T03:40:10.5604628Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:40:10.5605542Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:40:10.5612608Z parallel (file granularity) tests: 2022-11-23T03:40:10.5613031Z 2022-11-23T03:40:10.5613285Z serial (file granularity) tests: 2022-11-23T03:40:10.5613897Z distributed/fsdp/test_fsdp_summon_full_params.py 2022-11-23T03:40:12.9146679Z Ignoring disabled issues: [] 2022-11-23T03:40:13.3451426Z Running distributed/fsdp/test_fsdp_summon_full_params.py ... [2022-11-23 03:40:13.344333] 2022-11-23T03:40:13.3452331Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_summon_full_params.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:40:13.344801] 2022-11-23T03:43:59.2419960Z 2022-11-23T03:43:59.2422833Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_summon_full_params 2022-11-23T03:43:59.2424557Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_summon_full_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_summon_full_params_upzplz3c) 2022-11-23T03:43:59.2425290Z 2022-11-23T03:43:59.2425506Z Running tests... 
2022-11-23T03:43:59.2426434Z ---------------------------------------------------------------------- 2022-11-23T03:43:59.2427210Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params 2022-11-23T03:43:59.2427783Z test_cannot_summon_full_params_from_backward (__main__.TestSummonFullParams) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:43:59.2428298Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92400 2022-11-23T03:43:59.2429121Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92401 2022-11-23T03:43:59.2434095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2434689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2435293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2435802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2436404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2436845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2437431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2437920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2438372Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2438890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2439573Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2440271Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2440786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2441271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2442566Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2443453Z warnings.warn( 2022-11-23T03:43:59.2444821Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2445624Z warnings.warn( 2022-11-23T03:43:59.2445876Z dist init r=1, world=2 2022-11-23T03:43:59.2446136Z dist init r=0, world=2 2022-11-23T03:43:59.2446375Z ok (6.372s) 2022-11-23T03:43:59.2446837Z test_cannot_summon_full_params_from_forward (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92551 2022-11-23T03:43:59.2447403Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92552 2022-11-23T03:43:59.2448365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2448835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2449437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2449888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2450474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2450919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2451557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2452046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2452515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2453008Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2453657Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2454361Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2454878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2455338Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2456581Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2457365Z warnings.warn( 2022-11-23T03:43:59.2458514Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2459274Z warnings.warn( 2022-11-23T03:43:59.2459531Z dist init r=1, world=2 2022-11-23T03:43:59.2459762Z dist init r=0, world=2 2022-11-23T03:43:59.2460001Z ok (4.114s) 2022-11-23T03:43:59.2460354Z test_named_parameters_buffers_prefix__recurse_False (__main__.TestSummonFullParams) 2022-11-23T03:43:59.2460880Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92694 2022-11-23T03:43:59.2461457Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92695 2022-11-23T03:43:59.2462080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2462528Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2463087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2463567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2464449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2464899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2465451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2465914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2466372Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2466847Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2467504Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2468295Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2468821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2469276Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2469635Z dist init r=0, world=2 2022-11-23T03:43:59.2469890Z dist init r=1, world=2 2022-11-23T03:43:59.2470113Z ok (4.114s) 2022-11-23T03:43:59.2470475Z test_named_parameters_buffers_prefix__recurse_True (__main__.TestSummonFullParams) 2022-11-23T03:43:59.2471020Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92837 2022-11-23T03:43:59.2471541Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92838 2022-11-23T03:43:59.2472130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2472587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2473157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2473604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2474183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2474628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2475203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2475649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2476103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2476599Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-11-23T03:43:59.2477254Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2477919Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2478439Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2478976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2479321Z dist init r=0, world=2 2022-11-23T03:43:59.2479570Z dist init r=1, world=2 2022-11-23T03:43:59.2479812Z ok (4.114s) 2022-11-23T03:43:59.2480165Z test_named_parameters_buffers_prefix_test_prefix_recurse_False (__main__.TestSummonFullParams) 2022-11-23T03:43:59.2480723Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92980 2022-11-23T03:43:59.2481252Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92981 2022-11-23T03:43:59.2481870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2482303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2482870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2483342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2483910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2484334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2484900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T03:43:59.2485418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2485848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2486355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2487017Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2487712Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2488219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2488685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2489044Z dist init r=1, world=2 2022-11-23T03:43:59.2489278Z dist init r=0, world=2 2022-11-23T03:43:59.2489516Z ok (4.114s) 2022-11-23T03:43:59.2489885Z test_named_parameters_buffers_prefix_test_prefix_recurse_True (__main__.TestSummonFullParams) 2022-11-23T03:43:59.2490435Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93123 2022-11-23T03:43:59.2490934Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93124 2022-11-23T03:43:59.2491542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2492002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2492562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2493035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2493611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2494058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2494620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2495074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2495524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2496064Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2496712Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2497393Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2497915Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2498364Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2498715Z dist init r=0, world=2 2022-11-23T03:43:59.2498968Z dist init r=1, world=2 2022-11-23T03:43:59.2499209Z ok (4.014s) 2022-11-23T03:43:59.2499708Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93266 2022-11-23T03:43:59.2500313Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93267 2022-11-23T03:43:59.2500921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2501353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2501920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2502444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2503020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2503446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2504284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2504755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2505190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2505688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2506350Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2507040Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2507540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2508010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2508436Z dist init r=0, world=2 2022-11-23T03:43:59.2508695Z dist init r=1, world=2 2022-11-23T03:43:59.2508926Z ok (4.114s) 2022-11-23T03:43:59.2509463Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93409 2022-11-23T03:43:59.2510067Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93410 2022-11-23T03:43:59.2510660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2511118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2511699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2512167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2512720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2513246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2513825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2514270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:43:59.2514719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2515220Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2515978Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2516655Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2517175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2517646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2518003Z dist init r=0, world=2 2022-11-23T03:43:59.2518236Z dist init r=1, world=2 2022-11-23T03:43:59.2518483Z ok (4.115s) 2022-11-23T03:43:59.2519012Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93552 2022-11-23T03:43:59.2519677Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93553 2022-11-23T03:43:59.2520283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2520734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2521305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2521758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2522335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2522779Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2523328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2523795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2524257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2524758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2525397Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2526093Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2526628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2527108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2528172Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2528860Z warnings.warn( 2022-11-23T03:43:59.2529912Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2530587Z warnings.warn( 2022-11-23T03:43:59.2531556Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2532200Z warnings.warn( 2022-11-23T03:43:59.2533159Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T03:43:59.2533820Z warnings.warn( 2022-11-23T03:43:59.2534072Z dist init r=0, world=2 2022-11-23T03:43:59.2534310Z dist init r=1, world=2 2022-11-23T03:43:59.2534550Z ok (4.114s) 2022-11-23T03:43:59.2535059Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93695 2022-11-23T03:43:59.2535644Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93696 2022-11-23T03:43:59.2536315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2536767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2537337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2537788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2538364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2538804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2539362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2539826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2540285Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2540776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2541413Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2542098Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2542625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2543091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2544371Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2545046Z warnings.warn( 2022-11-23T03:43:59.2546020Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2546787Z warnings.warn( 2022-11-23T03:43:59.2547754Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T03:43:59.2548384Z warnings.warn( 2022-11-23T03:43:59.2549336Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2549980Z warnings.warn( 2022-11-23T03:43:59.2550231Z dist init r=0, world=2 2022-11-23T03:43:59.2550463Z dist init r=1, world=2 2022-11-23T03:43:59.2550709Z ok (4.114s) 2022-11-23T03:43:59.2551225Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93838 2022-11-23T03:43:59.2551811Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93839 2022-11-23T03:43:59.2552423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2552944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2553516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2553967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2554544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2554989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2555556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2556001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2556455Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2556954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2557589Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2558275Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2558792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2559265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2559603Z dist init r=1, world=2 2022-11-23T03:43:59.2559853Z dist init r=0, world=2 2022-11-23T03:43:59.2560093Z ok (4.115s) 2022-11-23T03:43:59.2560588Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93981 2022-11-23T03:43:59.2561195Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93982 2022-11-23T03:43:59.2561800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2562250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2562801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2563323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2563906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2564348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2564896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2565360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2565814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2566291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2566941Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2567629Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2568145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2568598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2568942Z dist init r=1, world=2 2022-11-23T03:43:59.2569194Z dist init r=0, world=2 2022-11-23T03:43:59.2569470Z ok (4.115s) 2022-11-23T03:43:59.2569976Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94124 2022-11-23T03:43:59.2570574Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94125 2022-11-23T03:43:59.2571180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2571614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2572183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2572654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2573212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2573654Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2574224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2574685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2575116Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2575605Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2576257Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2576935Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2577434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2577912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2578265Z dist init r=1, world=2 2022-11-23T03:43:59.2578497Z dist init r=0, world=2 2022-11-23T03:43:59.2578734Z ok (4.115s) 2022-11-23T03:43:59.2579245Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94267 2022-11-23T03:43:59.2579841Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94268 2022-11-23T03:43:59.2580484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2580935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2581505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2581957Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2582525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2582962Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2583528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2584150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:43:59.2584627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2585121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2585781Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2586440Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2587058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2587527Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2587862Z dist init r=1, world=2 2022-11-23T03:43:59.2588112Z dist init r=0, world=2 2022-11-23T03:43:59.2588353Z ok (4.114s) 2022-11-23T03:43:59.2588851Z test_params_count_and_value_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94410 2022-11-23T03:43:59.2589445Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94411 2022-11-23T03:43:59.2590053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2590507Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2591065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2591530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2592104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2592546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2593097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2593561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2594013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2594487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2595138Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2595824Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2596343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2596790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2597136Z dist init r=1, world=2 2022-11-23T03:43:59.2597451Z dist init r=0, world=2 2022-11-23T03:43:59.2597684Z ok (4.115s) 2022-11-23T03:43:59.2598195Z test_params_count_and_value_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94553 2022-11-23T03:43:59.2598790Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94554 2022-11-23T03:43:59.2599405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2599837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2600413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2600878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2601458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2601878Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2602447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2602911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2603399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2603897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2604554Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2605243Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2605744Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2606212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2606566Z dist init r=0, world=2 2022-11-23T03:43:59.2606798Z dist init r=1, world=2 2022-11-23T03:43:59.2607037Z ok (4.014s) 2022-11-23T03:43:59.2607554Z test_params_count_and_value_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94696 2022-11-23T03:43:59.2608229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94697 2022-11-23T03:43:59.2608830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2609277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2609860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2610317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2610876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2611319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2611889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2612337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:43:59.2612788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2613282Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2613930Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2614715Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2615245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2615714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2616070Z dist init r=0, world=2 2022-11-23T03:43:59.2616303Z dist init r=1, world=2 2022-11-23T03:43:59.2616540Z ok (4.115s) 2022-11-23T03:43:59.2617050Z test_params_count_and_value_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94839 2022-11-23T03:43:59.2617630Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94840 2022-11-23T03:43:59.2618241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2618693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2619270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2619722Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2620294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2620792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2621341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2621801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2622247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2622739Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2623376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2624258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2624792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2625262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2625597Z dist init r=0, world=2 2022-11-23T03:43:59.2625848Z dist init r=1, world=2 2022-11-23T03:43:59.2626086Z ok (4.115s) 2022-11-23T03:43:59.2626574Z test_params_count_and_value_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94982 2022-11-23T03:43:59.2627174Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94983 2022-11-23T03:43:59.2627790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2628225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2628799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2629270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2629846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2630267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2630836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2631370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2631830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2632306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2632965Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2633654Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2634155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2634627Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2634973Z dist init r=1, world=2 2022-11-23T03:43:59.2635224Z dist init r=0, world=2 2022-11-23T03:43:59.2635446Z ok (4.115s) 2022-11-23T03:43:59.2635959Z test_params_count_and_value_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95125 2022-11-23T03:43:59.2636556Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95126 2022-11-23T03:43:59.2637150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2637670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2638245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2638710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2639263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2639707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2640277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2640737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:43:59.2641171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2641667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2642317Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2642982Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2643498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2643972Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2644325Z dist init r=1, world=2 2022-11-23T03:43:59.2644557Z dist init r=0, world=2 2022-11-23T03:43:59.2644795Z ok (4.315s) 2022-11-23T03:43:59.2645299Z test_params_count_and_value_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95268 2022-11-23T03:43:59.2645883Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95269 2022-11-23T03:43:59.2646491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2646940Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2647510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2647957Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2648598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2649050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2649602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2650070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2650522Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2651013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2651646Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2652331Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2652850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2653319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2653652Z dist init r=0, world=2 2022-11-23T03:43:59.2653902Z dist init r=1, world=2 2022-11-23T03:43:59.2654139Z ok (4.115s) 2022-11-23T03:43:59.2654684Z test_params_count_and_value_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95411 2022-11-23T03:43:59.2655279Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95412 2022-11-23T03:43:59.2655887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2656337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2656891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2657355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2657933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2658354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2658923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2659383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2659834Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2660308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2660961Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2661648Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2662163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2662609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2662960Z dist init r=0, world=2 2022-11-23T03:43:59.2663211Z dist init r=1, world=2 2022-11-23T03:43:59.2663432Z ok (4.114s) 2022-11-23T03:43:59.2663752Z test_raises_rank0_with_writeback (__main__.TestSummonFullParams) 2022-11-23T03:43:59.2664470Z Tests that ``summon_full_params()`` with both ``rank0_only=True`` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95554 2022-11-23T03:43:59.2664974Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95555 2022-11-23T03:43:59.2665669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2666127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2666699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2667148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2667726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2668163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2668732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2669178Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2669628Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2670115Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2670749Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2671426Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2672058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2672529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2672867Z dist init r=1, world=2 2022-11-23T03:43:59.2673116Z dist init r=0, world=2 2022-11-23T03:43:59.2673356Z ok (4.013s) 2022-11-23T03:43:59.2673880Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95697 2022-11-23T03:43:59.2674506Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95698 2022-11-23T03:43:59.2675114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2675559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2676119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2676582Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2677151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2677592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2678145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2678602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2679213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2679848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2680504Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2681185Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2681699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2682150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2683602Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2684363Z warnings.warn( 2022-11-23T03:43:59.2685477Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2686209Z warnings.warn( 2022-11-23T03:43:59.2686936Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2687497Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2688262Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 
2022-11-23T03:43:59.2688862Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2689114Z dist init r=1, world=2 2022-11-23T03:43:59.2689356Z dist init r=0, world=2 2022-11-23T03:43:59.2689586Z ok (4.616s) 2022-11-23T03:43:59.2690093Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95848 2022-11-23T03:43:59.2690697Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95849 2022-11-23T03:43:59.2691285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2691716Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2692253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2692704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2693262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2693858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2694414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2694876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2695327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2695805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2696465Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2697146Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2697660Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2698108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2699405Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2700187Z warnings.warn( 2022-11-23T03:43:59.2701330Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2702096Z warnings.warn( 2022-11-23T03:43:59.2702850Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2703426Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2704414Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. 
It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2705071Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2705330Z dist init r=1, world=2 2022-11-23T03:43:59.2705577Z dist init r=0, world=2 2022-11-23T03:43:59.2705810Z ok (4.616s) 2022-11-23T03:43:59.2706352Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95999 2022-11-23T03:43:59.2706959Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96000 2022-11-23T03:43:59.2707900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2708435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2708995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2709463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2710033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2710474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2711027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2711488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2711938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2712424Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2713060Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2713743Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2714260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2714710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2716202Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2716964Z warnings.warn( 2022-11-23T03:43:59.2718065Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2718799Z warnings.warn( 2022-11-23T03:43:59.2719538Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 
2022-11-23T03:43:59.2720072Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2720827Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2721422Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2722392Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2723010Z warnings.warn( 2022-11-23T03:43:59.2724131Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2724775Z warnings.warn( 2022-11-23T03:43:59.2725736Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T03:43:59.2726383Z warnings.warn( 2022-11-23T03:43:59.2727313Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2728100Z warnings.warn( 2022-11-23T03:43:59.2728338Z dist init r=1, world=2 2022-11-23T03:43:59.2728580Z dist init r=0, world=2 2022-11-23T03:43:59.2728792Z ok (4.616s) 2022-11-23T03:43:59.2729479Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96150 2022-11-23T03:43:59.2730093Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96151 2022-11-23T03:43:59.2730684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2731180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2731761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2732222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2732778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2733225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2733789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2734248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2734680Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2735174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2735965Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2736605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2737109Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2737612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2738830Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2739571Z warnings.warn( 2022-11-23T03:43:59.2740853Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2741623Z warnings.warn( 2022-11-23T03:43:59.2742393Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. 
It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2743048Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2744039Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2744593Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2745597Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2746256Z warnings.warn( 2022-11-23T03:43:59.2747283Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2747920Z warnings.warn( 2022-11-23T03:43:59.2748882Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T03:43:59.2749692Z warnings.warn( 2022-11-23T03:43:59.2750599Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2751210Z warnings.warn( 2022-11-23T03:43:59.2751433Z dist init r=0, world=2 2022-11-23T03:43:59.2751670Z dist init r=1, world=2 2022-11-23T03:43:59.2751897Z ok (4.616s) 2022-11-23T03:43:59.2752398Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96301 2022-11-23T03:43:59.2753071Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96302 2022-11-23T03:43:59.2753657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2754089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2754623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2755259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2755839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2756262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2756827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2757283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2757894Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2758353Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2758980Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2759638Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2760141Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2760577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2762002Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2762766Z warnings.warn( 2022-11-23T03:43:59.2763973Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2764741Z warnings.warn( 2022-11-23T03:43:59.2765495Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. 
It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2766075Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2766864Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2767425Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2767690Z dist init r=1, world=2 2022-11-23T03:43:59.2767936Z dist init r=0, world=2 2022-11-23T03:43:59.2768175Z ok (4.616s) 2022-11-23T03:43:59.2768692Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96452 2022-11-23T03:43:59.2769317Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96453 2022-11-23T03:43:59.2770316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2770762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2771313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2771778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2772355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2772793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2773501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2773944Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2774383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2774839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2775667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2776349Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2776875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2777327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2778578Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2779511Z warnings.warn( 2022-11-23T03:43:59.2780863Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:43:59.2781636Z warnings.warn( 2022-11-23T03:43:59.2782388Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2782966Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2783762Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2784515Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2784777Z dist init r=0, world=2 2022-11-23T03:43:59.2785029Z dist init r=1, world=2 2022-11-23T03:43:59.2785267Z ok (4.616s) 2022-11-23T03:43:59.2785797Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96603 2022-11-23T03:43:59.2786401Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96604 2022-11-23T03:43:59.2787093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2787540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2788094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2788558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2789136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2789576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2790279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2790724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2791164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2791635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2792248Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2792908Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2793409Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2793844Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2795266Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2796040Z warnings.warn( 2022-11-23T03:43:59.2797249Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2798016Z warnings.warn( 2022-11-23T03:43:59.2798786Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2799350Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2800295Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 
2022-11-23T03:43:59.2800844Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2801112Z dist init r=1, world=2 2022-11-23T03:43:59.2801334Z dist init r=0, world=2 2022-11-23T03:43:59.2801565Z ok (4.616s) 2022-11-23T03:43:59.2802076Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96754 2022-11-23T03:43:59.2802650Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96755 2022-11-23T03:43:59.2803334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2803766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2804320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2804933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2805514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2805953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2806501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2806965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2807411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2808312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2808956Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2809637Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2810158Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2810623Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2811860Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2812636Z warnings.warn( 2022-11-23T03:43:59.2813834Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2814599Z warnings.warn( 2022-11-23T03:43:59.2815533Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2816357Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2817158Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. 
It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.2817728Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.2818006Z dist init r=1, world=2 2022-11-23T03:43:59.2818239Z dist init r=0, world=2 2022-11-23T03:43:59.2818472Z ok (4.616s) 2022-11-23T03:43:59.2818906Z test_summon_from_non_fsdp (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96905 2022-11-23T03:43:59.2819574Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96906 2022-11-23T03:43:59.2820157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2820640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2821191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2821628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2822182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2822607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2823141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2823763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2824425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2824917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2825559Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for 
key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2826242Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2826756Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2827224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2827558Z dist init r=1, world=2 2022-11-23T03:43:59.2827804Z dist init r=0, world=2 2022-11-23T03:43:59.2828042Z ok (4.014s) 2022-11-23T03:43:59.2828544Z test_summon_full_param_recursive_recurse_False_summon_outer_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97048 2022-11-23T03:43:59.2829157Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97049 2022-11-23T03:43:59.2829772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2830391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2831128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2831600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2832261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2832691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2833265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2833721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2834175Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2834654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2835305Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2835994Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2836671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2837106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2838314Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2839121Z warnings.warn( 2022-11-23T03:43:59.2840231Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:43:59.2841167Z warnings.warn( 2022-11-23T03:43:59.2841396Z dist init r=1, world=2 2022-11-23T03:43:59.2841643Z dist init r=0, world=2 2022-11-23T03:43:59.2841881Z ok (4.014s) 2022-11-23T03:43:59.2842375Z test_summon_full_param_recursive_recurse_False_summon_outer_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97191 2022-11-23T03:43:59.2842974Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97192 2022-11-23T03:43:59.2843741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2844172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2844708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2845155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2845711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2846121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2846846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2847311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2847761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2848234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2848933Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2849785Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2850282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2850716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2851924Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2852675Z warnings.warn( 2022-11-23T03:43:59.2853972Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2854775Z warnings.warn( 2022-11-23T03:43:59.2855004Z dist init r=1, world=2 2022-11-23T03:43:59.2855250Z dist init r=0, world=2 2022-11-23T03:43:59.2855486Z ok (4.114s) 2022-11-23T03:43:59.2855977Z test_summon_full_param_recursive_recurse_False_summon_outer_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97334 2022-11-23T03:43:59.2856581Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97335 2022-11-23T03:43:59.2857192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2857638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2858334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2858781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2859333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2859735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2860282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2860862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2861298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2861754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2862571Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2863258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2863777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2864454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2865787Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2866553Z warnings.warn( 2022-11-23T03:43:59.2867697Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2868463Z warnings.warn( 2022-11-23T03:43:59.2868692Z dist init r=1, world=2 2022-11-23T03:43:59.2869099Z dist init r=0, world=2 2022-11-23T03:43:59.2869328Z ok (4.114s) 2022-11-23T03:43:59.2869807Z test_summon_full_param_recursive_recurse_False_summon_outer_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97477 2022-11-23T03:43:59.2870396Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97478 2022-11-23T03:43:59.2870980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2871479Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2872017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2872464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2873017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2873440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2873970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2874416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2874849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2875486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2876139Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2876820Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2877333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2877782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2879028Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2879954Z warnings.warn( 2022-11-23T03:43:59.2881312Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2882078Z warnings.warn( 2022-11-23T03:43:59.2882305Z dist init r=0, world=2 2022-11-23T03:43:59.2882548Z dist init r=1, world=2 2022-11-23T03:43:59.2882783Z ok (4.115s) 2022-11-23T03:43:59.2883278Z test_summon_full_param_recursive_recurse_True_summon_outer_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97620 2022-11-23T03:43:59.2884038Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97621 2022-11-23T03:43:59.2884629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2885061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2885598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2886050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2886604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2887027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2887750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2888274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2888727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2889200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2889862Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2890703Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2891212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2891651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2892867Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2893617Z warnings.warn( 2022-11-23T03:43:59.2894918Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2895692Z warnings.warn( 2022-11-23T03:43:59.2895926Z dist init r=0, world=2 2022-11-23T03:43:59.2896184Z dist init r=1, world=2 2022-11-23T03:43:59.2896418Z ok (4.115s) 2022-11-23T03:43:59.2896911Z test_summon_full_param_recursive_recurse_True_summon_outer_False_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97763 2022-11-23T03:43:59.2897511Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97764 2022-11-23T03:43:59.2898112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2898606Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2899171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2899649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2900395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2901022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2901577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2902049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2902506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2902989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2903801Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2904882Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2905408Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2905943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2907203Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2908142Z warnings.warn( 2022-11-23T03:43:59.2909548Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2910311Z warnings.warn( 2022-11-23T03:43:59.2910545Z dist init r=0, world=2 2022-11-23T03:43:59.2910800Z dist init r=1, world=2 2022-11-23T03:43:59.2911046Z ok (4.114s) 2022-11-23T03:43:59.2911543Z test_summon_full_param_recursive_recurse_True_summon_outer_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97906 2022-11-23T03:43:59.2912144Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97907 2022-11-23T03:43:59.2912749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2913206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2913762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2914241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2914826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2915276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2915988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2916696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2917173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2917654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2918319Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2919010Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2919688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2920131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2921352Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2922107Z warnings.warn( 2022-11-23T03:43:59.2923431Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2924256Z warnings.warn( 2022-11-23T03:43:59.2924490Z dist init r=1, world=2 2022-11-23T03:43:59.2924746Z dist init r=0, world=2 2022-11-23T03:43:59.2924996Z ok (4.115s) 2022-11-23T03:43:59.2925490Z test_summon_full_param_recursive_recurse_True_summon_outer_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98049 2022-11-23T03:43:59.2926098Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98050 2022-11-23T03:43:59.2926713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2927171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2927730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2928203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2928931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2929369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2929906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2930363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2930810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2931275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2931912Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2932577Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2933085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2933571Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2934798Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2935531Z warnings.warn( 2022-11-23T03:43:59.2936637Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.2937377Z warnings.warn( 2022-11-23T03:43:59.2937597Z dist init r=0, world=2 2022-11-23T03:43:59.2937835Z dist init r=1, world=2 2022-11-23T03:43:59.2938064Z ok (4.114s) 2022-11-23T03:43:59.2938501Z test_summon_full_param_shard_value_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98192 2022-11-23T03:43:59.2939116Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98193 2022-11-23T03:43:59.2939706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2940139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2940674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2941310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2941887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2942328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2942879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2943343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2943795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2944624Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2945265Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2945926Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2946425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2946856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2947386Z dist init r=0, world=2 2022-11-23T03:43:59.2947641Z dist init r=1, world=2 2022-11-23T03:43:59.2947860Z ok (4.014s) 2022-11-23T03:43:59.2948328Z test_summon_full_param_shard_value_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98335 2022-11-23T03:43:59.2948884Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98336 2022-11-23T03:43:59.2949491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2949991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2950579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2951046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2951618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2952204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2952750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2953377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2953808Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2954298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2954949Z INFO:torch.distributed.distributed_c10d:Rank 1: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2955633Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2956134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2956674Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2957027Z dist init r=0, world=2 2022-11-23T03:43:59.2957258Z dist init r=1, world=2 2022-11-23T03:43:59.2957495Z ok (4.014s) 2022-11-23T03:43:59.2957935Z test_summon_full_param_writeback (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98478 2022-11-23T03:43:59.2958614Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98479 2022-11-23T03:43:59.2959188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2959618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2960167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2960596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2961156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2961577Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2962122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2962733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2963186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 
1 2022-11-23T03:43:59.2963676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2964325Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2964989Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2965669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2966122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2966626Z dist init r=1, world=2 2022-11-23T03:43:59.2966874Z dist init r=0, world=2 2022-11-23T03:43:59.2967111Z ok (4.314s) 2022-11-23T03:43:59.2967655Z test_summon_full_params_equivalence_rank0_only_False_offload_to_cpu_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98621 2022-11-23T03:43:59.2968227Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98622 2022-11-23T03:43:59.2968829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2969438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2969974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2970424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2970981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2971407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2971936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: 
loaded 420 disabled tests 2022-11-23T03:43:59.2972389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2972820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2973291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2974093Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2974829Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2975342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2975789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2976139Z dist init r=0, world=2 2022-11-23T03:43:59.2976387Z dist init r=1, world=2 2022-11-23T03:43:59.2976611Z ok (4.014s) 2022-11-23T03:43:59.2977103Z test_summon_full_params_equivalence_rank0_only_False_offload_to_cpu_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98764 2022-11-23T03:43:59.2977680Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98765 2022-11-23T03:43:59.2978285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2978724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2979295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2979760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2980476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2981068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2981635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2982093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2982524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2983013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2983662Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.2984683Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.2985167Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.2985685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.2986732Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2987376Z warnings.warn( 2022-11-23T03:43:59.2988274Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2988899Z warnings.warn( 2022-11-23T03:43:59.2989832Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T03:43:59.2990470Z warnings.warn( 2022-11-23T03:43:59.2991384Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T03:43:59.2992071Z warnings.warn( 2022-11-23T03:43:59.2992292Z dist init r=0, world=2 2022-11-23T03:43:59.2992530Z dist init r=1, world=2 2022-11-23T03:43:59.2992755Z ok (4.114s) 2022-11-23T03:43:59.2993215Z test_summon_full_params_equivalence_rank0_only_True_offload_to_cpu_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98907 2022-11-23T03:43:59.2993780Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98908 2022-11-23T03:43:59.2994366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2994958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2995526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2995992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2996560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.2996982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.2997551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.2998014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.2998446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.2998934Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.2999590Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.3000276Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3000777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.3001004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.3001183Z dist init r=0, world=2 2022-11-23T03:43:59.3001296Z dist init r=1, world=2 2022-11-23T03:43:59.3001395Z ok (4.114s) 2022-11-23T03:43:59.3001761Z test_summon_full_params_equivalence_rank0_only_True_offload_to_cpu_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99050 2022-11-23T03:43:59.3001977Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99051 2022-11-23T03:43:59.3002357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3002677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3003040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3003223Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3003577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3003924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3004298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3004488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3004777Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.3004994Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.3005392Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3005783Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3006016Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.3006242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.3006354Z dist init r=0, world=2 2022-11-23T03:43:59.3006461Z dist init r=1, world=2 2022-11-23T03:43:59.3006560Z ok (4.114s) 2022-11-23T03:43:59.3006910Z test_summon_full_params_respects_reshard_after_forward (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99193 2022-11-23T03:43:59.3007116Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99194 2022-11-23T03:43:59.3007490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3007665Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3008045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3008470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3009013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3009185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3009564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T03:43:59.3009739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3009983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.3010224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.3010619Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3011059Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3011292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.3011519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.3012539Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.3012651Z warnings.warn( 2022-11-23T03:43:59.3013660Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:43:59.3013817Z warnings.warn( 2022-11-23T03:43:59.3014451Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.3014575Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.3015205Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T03:43:59.3015346Z warnings.warn(message, UserWarning) 2022-11-23T03:43:59.3015456Z dist init r=1, world=2 2022-11-23T03:43:59.3015564Z dist init r=0, world=2 2022-11-23T03:43:59.3015664Z ok (4.615s) 2022-11-23T03:43:59.3016131Z test_summon_single_param (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99336 2022-11-23T03:43:59.3016344Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99337 2022-11-23T03:43:59.3016683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3016851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3017212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3017395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3017754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3017923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3018283Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3018469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3018703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.3018914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.3019298Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3019724Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3019950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.3020169Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.3021149Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.3021263Z warnings.warn( 2022-11-23T03:43:59.3022234Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:43:59.3022341Z warnings.warn( 2022-11-23T03:43:59.3022447Z dist init r=1, world=2 2022-11-23T03:43:59.3022550Z dist init r=0, world=2 2022-11-23T03:43:59.3022674Z ok (4.113s) 2022-11-23T03:43:59.3022843Z test_with_grads_core (__main__.TestSummonFullParams) 2022-11-23T03:43:59.3023139Z Tests the core usage of ``summon_full_params(with_grads=True)``. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99479 2022-11-23T03:43:59.3023347Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99480 2022-11-23T03:43:59.3023703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3024063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3024612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3024784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3025140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3025333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3025710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3025895Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3026140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.3026378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.3026775Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3027170Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3027398Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.3027609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.3027845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3028079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3028302Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3028752Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3028978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3029192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3029404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3029600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3029816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3030031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3030244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3030456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3030672Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:43:59.3030884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3031098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3031292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3031566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3031779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3031991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3032204Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3032415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3032629Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3032842Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3033054Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:43:59.3033144Z dist init r=1, world=2 2022-11-23T03:43:59.3033250Z dist init r=0, world=2 2022-11-23T03:43:59.3033351Z ok (7.220s) 2022-11-23T03:43:59.3033533Z test_with_grads_none_grads (__main__.TestSummonFullParams) 2022-11-23T03:43:59.3033970Z Tests that if all ranks' ``FlatParameter`` has ``None`` gradient, then ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99630 2022-11-23T03:43:59.3034180Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99631 2022-11-23T03:43:59.3034534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3034707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3035056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3035241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3035592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3035763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3036123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3036306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3036538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:43:59.3036848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.3037226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:43:59.3037606Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:43:59.3037828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.3038051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:43:59.3038159Z dist init r=1, world=2 2022-11-23T03:43:59.3038262Z dist init r=0, world=2 2022-11-23T03:43:59.3038359Z ok (4.214s) 2022-11-23T03:43:59.3038682Z test_summon_full_param_writeback (__main__.TestSummonFullParamsNoShard) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99773 2022-11-23T03:43:59.3039023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:43:59.3039194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:43:59.3039554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:43:59.3039739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:43:59.3039972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:43:59.3040401Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:43:59.3040620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:43:59.3040729Z dist init r=0, world=1 2022-11-23T03:43:59.3040824Z ok (4.012s) 2022-11-23T03:43:59.3040845Z 2022-11-23T03:43:59.3041269Z ---------------------------------------------------------------------- 2022-11-23T03:43:59.3041389Z Ran 52 tests in 223.427s 2022-11-23T03:43:59.3041409Z 2022-11-23T03:43:59.3041499Z OK 2022-11-23T03:43:59.3041518Z 2022-11-23T03:43:59.3041642Z Generating XML reports... 
2022-11-23T03:43:59.3042112Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParams-20221123034015.xml 2022-11-23T03:43:59.3042609Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParamsNoShard-20221123034015.xml 2022-11-23T03:43:59.3042632Z 2022-11-23T03:43:59.3043069Z ##[endgroup] 2022-11-23T03:43:59.3043591Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_summon_full_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_summon_full_params_upzplz3c) 2022-11-23T03:43:59.3043612Z 2022-11-23T03:43:59.5886871Z 2022-11-23T03:43:59.5887320Z real 3m51.473s 2022-11-23T03:43:59.5887423Z user 7m43.772s 2022-11-23T03:43:59.5887533Z sys 6m18.800s 2022-11-23T03:43:59.5887704Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:43:59.5888104Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_tp_integration.py 2022-11-23T03:44:01.9825648Z Ignoring disabled issues: [] 2022-11-23T03:44:02.0371527Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:44:02.0372136Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:44:02.0372507Z Selected tests: 2022-11-23T03:44:02.0372794Z distributed/fsdp/test_fsdp_tp_integration.py 2022-11-23T03:44:02.0398067Z Prioritized test from test file changes. 
2022-11-23T03:44:02.0398409Z reordering tests for PR: 2022-11-23T03:44:02.0398667Z prioritized: [] 2022-11-23T03:44:02.0399183Z the rest: ['distributed/fsdp/test_fsdp_tp_integration.py'] 2022-11-23T03:44:02.0399409Z 2022-11-23T03:44:02.0400270Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:44:02.0401244Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:44:02.0406977Z parallel (file granularity) tests: 2022-11-23T03:44:02.0407255Z 2022-11-23T03:44:02.0407590Z serial (file granularity) tests: 2022-11-23T03:44:02.0408138Z distributed/fsdp/test_fsdp_tp_integration.py 2022-11-23T03:44:04.3701873Z Ignoring disabled issues: [] 2022-11-23T03:44:04.8062415Z Running distributed/fsdp/test_fsdp_tp_integration.py ... [2022-11-23 03:44:04.805663] 2022-11-23T03:44:04.8063525Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_tp_integration.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:44:04.806168] 2022-11-23T03:44:35.6255069Z 2022-11-23T03:44:35.6255899Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_tp_integration 2022-11-23T03:44:35.6256907Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_tp_integration (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_tp_integration_0r4krzfk) 2022-11-23T03:44:35.6258026Z 2022-11-23T03:44:35.6258223Z Running tests... 
2022-11-23T03:44:35.6261980Z ---------------------------------------------------------------------- 2022-11-23T03:44:35.6262881Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_tp_integration 2022-11-23T03:44:35.6263371Z test_fsdp_tp_checkpoint_integration (__main__.TestTPFSDPIntegration) 2022-11-23T03:44:35.6263840Z Tests checkpointing for TP + FSDP integration. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:44:35.6264601Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100057 2022-11-23T03:44:35.6266523Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100058 2022-11-23T03:44:35.6266983Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100059 2022-11-23T03:44:35.6267420Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100060 2022-11-23T03:44:35.6268063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6268520Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6269104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6269576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6270140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6270580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6271204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6271670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6272227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:44:35.6272671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6273239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6273697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6274268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6274715Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6275295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6275874Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6276348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:44:35.6276955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:44:35.6277449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:44:35.6277933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:44:35.6278584Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6279257Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6279937Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6280612Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:44:35.6281126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:44:35.6281577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:44:35.6282117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:44:35.6282635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:44:35.6283102Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:44:35.6283586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:44:35.6284067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:44:35.6284549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:44:35.6285190Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6285867Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6286549Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6287216Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:44:35.6287718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:44:35.6288200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:44:35.6288681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:44:35.6289143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:44:35.6289784Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6290458Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6291127Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6291772Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6292285Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:44:35.6292821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:44:35.6293309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:44:35.6293770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:44:35.6294402Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6295073Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:44:35.6295739Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6296239Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:44:35.6296722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:44:35.6297202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:44:35.6297843Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6298347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:44:35.6299041Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6299762Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6300430Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6301077Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6301982Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6302609Z warnings.warn( 2022-11-23T03:44:35.6303354Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. 
Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6304115Z warnings.warn( 2022-11-23T03:44:35.6304876Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6305411Z warnings.warn( 2022-11-23T03:44:35.6306150Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6306658Z warnings.warn( 2022-11-23T03:44:35.6306904Z dist init r=3, world=4 2022-11-23T03:44:35.6307150Z dist init r=0, world=4 2022-11-23T03:44:35.6307379Z dist init r=1, world=4 2022-11-23T03:44:35.6307626Z dist init r=2, world=4 2022-11-23T03:44:35.6307856Z ok (7.066s) 2022-11-23T03:44:35.6308269Z test_fsdp_tp_integration_tensor_parallel_size_2_cpu_offload_CPUOffload(offload_params=False) (__main__.TestTPFSDPIntegration) 2022-11-23T03:44:35.6308998Z Tests training for TP + FSDP integration by comparing an FSDP-only ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100402 2022-11-23T03:44:35.6309535Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100403 2022-11-23T03:44:35.6310063Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100404 2022-11-23T03:44:35.6310500Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100405 2022-11-23T03:44:35.6311108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6311556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6312129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6312578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6313152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6313597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6314166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6314611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6315178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6315620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6316164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6316701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6317272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:44:35.6317708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6318255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6318720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6319166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:44:35.6319638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:44:35.6320121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:44:35.6320604Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:44:35.6321251Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6321914Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6322596Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6323270Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:44:35.6323789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:44:35.6324238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:44:35.6324697Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:44:35.6325160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:44:35.6325639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:44:35.6326108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:44:35.6326587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:44:35.6327148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:44:35.6327783Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6328456Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6328983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:44:35.6329464Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:44:35.6330093Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6330618Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:44:35.6331256Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:44:35.6331780Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:44:35.6332403Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6332986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:44:35.6333689Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6334363Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6334869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:44:35.6335356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:44:35.6335992Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6336496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:44:35.6337188Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6337709Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:44:35.6338339Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6338842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:44:35.6339487Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:44:35.6340151Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6340669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:44:35.6341134Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:44:35.6341772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6342445Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6343136Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6344087Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6345019Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6345565Z warnings.warn( 2022-11-23T03:44:35.6346308Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6346822Z warnings.warn( 2022-11-23T03:44:35.6347558Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:44:35.6348090Z warnings.warn( 2022-11-23T03:44:35.6348828Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6349341Z warnings.warn( 2022-11-23T03:44:35.6349585Z dist init r=3, world=4 2022-11-23T03:44:35.6349832Z dist init r=1, world=4 2022-11-23T03:44:35.6350058Z dist init r=0, world=4 2022-11-23T03:44:35.6350392Z dist init r=2, world=4 2022-11-23T03:44:35.6350624Z ok (5.321s) 2022-11-23T03:44:35.6351019Z test_fsdp_tp_integration_tensor_parallel_size_2_cpu_offload_CPUOffload(offload_params=True) (__main__.TestTPFSDPIntegration) 2022-11-23T03:44:35.6351763Z Tests training for TP + FSDP integration by comparing an FSDP-only ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100747 2022-11-23T03:44:35.6352343Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100748 2022-11-23T03:44:35.6352789Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100749 2022-11-23T03:44:35.6353212Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100750 2022-11-23T03:44:35.6353815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6354264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6354841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6355290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6355860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6356300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:44:35.6356848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6357308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6357874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6358306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6358854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6359314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6359879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6360316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6360860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6361404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6361860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:44:35.6362329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:44:35.6362813Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:44:35.6363297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:44:35.6363943Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:44:35.6364605Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6365284Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6365959Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6366468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:44:35.6367043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:44:35.6367555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:44:35.6368015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:44:35.6368472Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:44:35.6368956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:44:35.6369440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:44:35.6369966Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:44:35.6370596Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6371119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:44:35.6371756Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:44:35.6372428Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6372938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:44:35.6373419Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:44:35.6374066Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6374592Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:44:35.6375208Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6375730Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:44:35.6376367Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6376873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:44:35.6377514Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6378233Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6378762Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:44:35.6379222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:44:35.6379858Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:44:35.6380533Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6381055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:44:35.6381518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:44:35.6382161Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6382749Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:44:35.6383480Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6384156Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:44:35.6384871Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6385540Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6386206Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6386858Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6387762Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:44:35.6388320Z warnings.warn( 2022-11-23T03:44:35.6389070Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6389589Z warnings.warn( 2022-11-23T03:44:35.6390322Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6390850Z warnings.warn( 2022-11-23T03:44:35.6391595Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6392104Z warnings.warn( 2022-11-23T03:44:35.6392344Z dist init r=3, world=4 2022-11-23T03:44:35.6392590Z dist init r=1, world=4 2022-11-23T03:44:35.6392822Z dist init r=0, world=4 2022-11-23T03:44:35.6393063Z dist init r=2, world=4 2022-11-23T03:44:35.6393292Z ok (5.321s) 2022-11-23T03:44:35.6393690Z test_fsdp_tp_integration_tensor_parallel_size_4_cpu_offload_CPUOffload(offload_params=False) (__main__.TestTPFSDPIntegration) 2022-11-23T03:44:35.6394433Z Tests training for TP + FSDP integration by comparing an FSDP-only ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101092 2022-11-23T03:44:35.6394972Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101093 2022-11-23T03:44:35.6395492Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 101094 2022-11-23T03:44:35.6395927Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 101095 2022-11-23T03:44:35.6396528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6396974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6397527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6397993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6398561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6399003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6399552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6400013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6400581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6401015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6401563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6402102Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6402670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:44:35.6403087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6403650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6404107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6404552Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:44:35.6405024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:44:35.6405503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:44:35.6405985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:44:35.6406627Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6407292Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6407974Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6408651Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:44:35.6409232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:44:35.6409699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:44:35.6410146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:44:35.6410606Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:44:35.6411078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:44:35.6411562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:44:35.6412081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:44:35.6412567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:44:35.6413214Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6413734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:44:35.6414359Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6414878Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:44:35.6415512Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6416035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:44:35.6416662Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:44:35.6417179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:44:35.6417811Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6418372Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:44:35.6419009Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6419529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:44:35.6420160Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6420669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:44:35.6421297Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6421817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:44:35.6422455Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6422961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:44:35.6423596Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6424345Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:44:35.6424992Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:44:35.6425501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:44:35.6426132Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6426650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:44:35.6427292Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6427795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:44:35.6428426Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6428943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:44:35.6429635Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6430166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:44:35.6430807Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6431329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:44:35.6431947Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:44:35.6432620Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:44:35.6433290Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:44:35.6434009Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:44:35.6434893Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6450351Z warnings.warn( 2022-11-23T03:44:35.6451199Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6451781Z warnings.warn( 2022-11-23T03:44:35.6452567Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6453145Z warnings.warn( 2022-11-23T03:44:35.6453942Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:44:35.6454514Z warnings.warn( 2022-11-23T03:44:35.6454760Z dist init r=2, world=4 2022-11-23T03:44:35.6455017Z dist init r=3, world=4 2022-11-23T03:44:35.6455279Z dist init r=1, world=4 2022-11-23T03:44:35.6455521Z dist init r=0, world=4 2022-11-23T03:44:35.6455765Z ok (5.421s) 2022-11-23T03:44:35.6456201Z test_fsdp_tp_integration_tensor_parallel_size_4_cpu_offload_CPUOffload(offload_params=True) (__main__.TestTPFSDPIntegration) 2022-11-23T03:44:35.6456976Z Tests training for TP + FSDP integration by comparing an FSDP-only ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101438 2022-11-23T03:44:35.6457552Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101439 2022-11-23T03:44:35.6458029Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 101440 2022-11-23T03:44:35.6458505Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 101441 2022-11-23T03:44:35.6459129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6459610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6460214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6460706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6461297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6461770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6462471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6462939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6463550Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6464374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6464976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6465428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6465999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:35.6466435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:35.6466999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:35.6467468Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:35.6467915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:44:35.6468412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:44:35.6469011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:44:35.6469503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:44:35.6470161Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6470848Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6471513Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:44:35.6472192Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:44:35.6472706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:44:35.6473172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:44:35.6473621Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:44:35.6474072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:44:35.6474551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:44:35.6475020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:44:35.6475511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:44:35.6475996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:44:35.6476644Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6477305Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6477827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T03:44:35.6478312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T03:44:35.6478945Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T03:44:35.6479517Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T03:44:35.6480153Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:44:35.6480677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T03:44:35.6481309Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6481817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T03:44:35.6482514Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6483186Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6483700Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 2 2022-11-23T03:44:35.6484188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T03:44:35.6484827Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T03:44:35.6485348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 3 2022-11-23T03:44:35.6486036Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6486556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 3 2022-11-23T03:44:35.6487198Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 
2022-11-23T03:44:35.6487710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T03:44:35.6488323Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6488985Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 4 nodes. 2022-11-23T03:44:35.6489508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T03:44:35.6489999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 2 2022-11-23T03:44:35.6490616Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6491289Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6491815Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T03:44:35.6492391Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 2 2022-11-23T03:44:35.6493010Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6493529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 3 2022-11-23T03:44:35.6494162Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 4 nodes. 2022-11-23T03:44:35.6494665Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T03:44:35.6495300Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 
2022-11-23T03:44:35.6495965Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:44:35.6496702Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:44:35.6497361Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:6 with 4 nodes. 2022-11-23T03:44:35.6498261Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6498811Z warnings.warn( 2022-11-23T03:44:35.6499578Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6500126Z warnings.warn( 2022-11-23T03:44:35.6500857Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:44:35.6501401Z warnings.warn( 2022-11-23T03:44:35.6502138Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T03:44:35.6502727Z warnings.warn( 2022-11-23T03:44:35.6502959Z dist init r=1, world=4 2022-11-23T03:44:35.6503214Z dist init r=0, world=4 2022-11-23T03:44:35.6503462Z dist init r=3, world=4 2022-11-23T03:44:35.6503689Z dist init r=2, world=4 2022-11-23T03:44:35.6504170Z ok (5.221s) 2022-11-23T03:44:35.6504327Z 2022-11-23T03:44:35.6504610Z ---------------------------------------------------------------------- 2022-11-23T03:44:35.6504922Z Ran 5 tests in 28.352s 2022-11-23T03:44:35.6505081Z 2022-11-23T03:44:35.6505175Z OK 2022-11-23T03:44:35.6505308Z 2022-11-23T03:44:35.6505438Z Generating XML reports... 2022-11-23T03:44:35.6506055Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_tp_integration/TEST-TestTPFSDPIntegration-20221123034406.xml 2022-11-23T03:44:35.6506440Z 2022-11-23T03:44:35.6507001Z ##[endgroup] 2022-11-23T03:44:35.6507628Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_tp_integration (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_tp_integration_0r4krzfk) 2022-11-23T03:44:35.6508008Z 2022-11-23T03:44:36.0456410Z 2022-11-23T03:44:36.0456781Z real 0m36.457s 2022-11-23T03:44:36.0457096Z user 1m44.464s 2022-11-23T03:44:36.0457345Z sys 1m9.946s 2022-11-23T03:44:36.0457614Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:44:36.0458222Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_traversal.py 2022-11-23T03:44:38.3926772Z Ignoring disabled issues: [] 2022-11-23T03:44:38.4464792Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:44:38.4465355Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:44:38.4465717Z Selected tests: 2022-11-23T03:44:38.4466006Z distributed/fsdp/test_fsdp_traversal.py 2022-11-23T03:44:38.4491478Z Prioritized test from test file changes. 
2022-11-23T03:44:38.4492035Z reordering tests for PR: 2022-11-23T03:44:38.4492535Z prioritized: [] 2022-11-23T03:44:38.4493051Z the rest: ['distributed/fsdp/test_fsdp_traversal.py'] 2022-11-23T03:44:38.4493270Z 2022-11-23T03:44:38.4493784Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:44:38.4494726Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:44:38.4500936Z parallel (file granularity) tests: 2022-11-23T03:44:38.4501363Z 2022-11-23T03:44:38.4501606Z serial (file granularity) tests: 2022-11-23T03:44:38.4501925Z distributed/fsdp/test_fsdp_traversal.py 2022-11-23T03:44:40.7945489Z Ignoring disabled issues: [] 2022-11-23T03:44:41.2039997Z Running distributed/fsdp/test_fsdp_traversal.py ... [2022-11-23 03:44:41.203273] 2022-11-23T03:44:41.2040699Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_traversal.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:44:41.203698] 2022-11-23T03:44:49.4326406Z 2022-11-23T03:44:49.4327250Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_traversal 2022-11-23T03:44:49.4328331Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_traversal (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_traversal_jxiof4on) 2022-11-23T03:44:49.4328714Z 2022-11-23T03:44:49.4328809Z Running tests... 2022-11-23T03:44:49.4329431Z ---------------------------------------------------------------------- 2022-11-23T03:44:49.4330032Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_traversal 2022-11-23T03:44:49.4330544Z test_fsdp_modules (__main__.TestTraversal) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:44:49.4331023Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101996 2022-11-23T03:44:49.4331851Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101997 2022-11-23T03:44:49.4332542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:49.4333031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:49.4333619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:49.4334134Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:49.4334754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:44:49.4335126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:44:49.4335740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:44:49.4336248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:44:49.4336740Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:44:49.4337252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:44:49.4337949Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:44:49.4338681Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:44:49.4339241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:44:49.4339719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:44:49.4340086Z dist init r=0, world=2 2022-11-23T03:44:49.4340360Z dist init r=1, world=2 2022-11-23T03:44:49.4340599Z ok (5.761s) 2022-11-23T03:44:49.4340762Z 2022-11-23T03:44:49.4341050Z ---------------------------------------------------------------------- 2022-11-23T03:44:49.4341410Z Ran 1 test in 5.761s 2022-11-23T03:44:49.4341604Z 2022-11-23T03:44:49.4341701Z OK 2022-11-23T03:44:49.4341822Z 2022-11-23T03:44:49.4341956Z Generating XML reports... 2022-11-23T03:44:49.4342575Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_traversal/TEST-TestTraversal-20221123034443.xml 2022-11-23T03:44:49.4342935Z 2022-11-23T03:44:49.4343273Z ##[endgroup] 2022-11-23T03:44:49.4344495Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_traversal (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_traversal_jxiof4on) 2022-11-23T03:44:49.4344887Z 2022-11-23T03:44:49.7744158Z 2022-11-23T03:44:49.7744964Z real 0m13.729s 2022-11-23T03:44:49.7745317Z user 0m23.830s 2022-11-23T03:44:49.7745563Z sys 0m21.932s 2022-11-23T03:44:49.7745834Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:44:49.7746405Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_uneven.py 2022-11-23T03:44:52.1232980Z Ignoring disabled issues: [] 2022-11-23T03:44:52.1761338Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T03:44:52.1761932Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:44:52.1762299Z Selected tests: 2022-11-23T03:44:52.1762555Z distributed/fsdp/test_fsdp_uneven.py 2022-11-23T03:44:52.1790003Z Prioritized test from test file changes. 2022-11-23T03:44:52.1790381Z reordering tests for PR: 2022-11-23T03:44:52.1790651Z prioritized: [] 2022-11-23T03:44:52.1791188Z the rest: ['distributed/fsdp/test_fsdp_uneven.py'] 2022-11-23T03:44:52.1791424Z 2022-11-23T03:44:52.1791961Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:44:52.1792901Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:44:52.1798678Z parallel (file granularity) tests: 2022-11-23T03:44:52.1798961Z 2022-11-23T03:44:52.1799223Z serial (file granularity) tests: 2022-11-23T03:44:52.1799503Z distributed/fsdp/test_fsdp_uneven.py 2022-11-23T03:44:54.4848750Z Ignoring disabled issues: [] 2022-11-23T03:44:54.8913055Z Running distributed/fsdp/test_fsdp_uneven.py ... [2022-11-23 03:44:54.890707] 2022-11-23T03:44:54.8914125Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_uneven.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:44:54.891137] 2022-11-23T03:45:03.9141801Z 2022-11-23T03:45:03.9142294Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_uneven 2022-11-23T03:45:03.9143578Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_uneven (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_uneven_zo0htdue) 2022-11-23T03:45:03.9144327Z 2022-11-23T03:45:03.9144455Z Running tests... 
2022-11-23T03:45:03.9145104Z ---------------------------------------------------------------------- 2022-11-23T03:45:03.9145726Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven 2022-11-23T03:45:03.9146180Z test_one_iteration (__main__.TestUnevenParamShard) 2022-11-23T03:45:03.9146612Z Test FSDP with uneven divide of parameter shards. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:45:03.9147091Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102351 2022-11-23T03:45:03.9147527Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102352 2022-11-23T03:45:03.9147964Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 102353 2022-11-23T03:45:03.9148409Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 102354 2022-11-23T03:45:03.9149023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:45:03.9149490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:45:03.9150075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:45:03.9150556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:45:03.9151117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:45:03.9151838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:45:03.9152441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:45:03.9152910Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:45:03.9153474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T03:45:03.9153931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:45:03.9154507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:45:03.9154954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:45:03.9155535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:45:03.9155993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:45:03.9156561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:45:03.9157003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:45:03.9157455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:45:03.9158080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:45:03.9158567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:45:03.9159034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:45:03.9159694Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:45:03.9160381Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:45:03.9161064Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:45:03.9161715Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:45:03.9162234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:45:03.9162703Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:45:03.9163150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:45:03.9163613Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:45:03.9163960Z dist init r=1, world=4 2022-11-23T03:45:03.9164209Z dist init r=0, world=4 2022-11-23T03:45:03.9164437Z dist init r=2, world=4 2022-11-23T03:45:03.9164687Z dist init r=3, world=4 2022-11-23T03:45:03.9164921Z ok (6.581s) 2022-11-23T03:45:03.9165050Z 2022-11-23T03:45:03.9165327Z ---------------------------------------------------------------------- 2022-11-23T03:45:03.9165656Z Ran 1 test in 6.582s 2022-11-23T03:45:03.9165815Z 2022-11-23T03:45:03.9165907Z OK 2022-11-23T03:45:03.9166039Z 2022-11-23T03:45:03.9166146Z Generating XML reports... 
2022-11-23T03:45:03.9166758Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven/TEST-TestUnevenParamShard-20221123034456.xml 2022-11-23T03:45:03.9167118Z 2022-11-23T03:45:03.9167442Z ##[endgroup] 2022-11-23T03:45:03.9168018Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_uneven (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_uneven_zo0htdue) 2022-11-23T03:45:03.9168371Z 2022-11-23T03:45:04.3001602Z 2022-11-23T03:45:04.3002114Z real 0m14.526s 2022-11-23T03:45:04.3002363Z user 0m31.881s 2022-11-23T03:45:04.3002617Z sys 0m25.267s 2022-11-23T03:45:04.3003374Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:45:04.3003973Z + python test/run_test.py --verbose -i distributed/fsdp/test_fsdp_use_orig_params.py 2022-11-23T03:45:06.6919866Z Ignoring disabled issues: [] 2022-11-23T03:45:06.7451745Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:45:06.7452934Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:45:06.7453305Z Selected tests: 2022-11-23T03:45:06.7453602Z distributed/fsdp/test_fsdp_use_orig_params.py 2022-11-23T03:45:06.7482414Z Prioritized test from test file changes. 
2022-11-23T03:45:06.7482785Z reordering tests for PR: 2022-11-23T03:45:06.7483088Z prioritized: [] 2022-11-23T03:45:06.7483622Z the rest: ['distributed/fsdp/test_fsdp_use_orig_params.py'] 2022-11-23T03:45:06.7483863Z 2022-11-23T03:45:06.7484331Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:45:06.7485297Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:45:06.7490969Z parallel (file granularity) tests: 2022-11-23T03:45:06.7491672Z 2022-11-23T03:45:06.7491945Z serial (file granularity) tests: 2022-11-23T03:45:06.7492284Z distributed/fsdp/test_fsdp_use_orig_params.py 2022-11-23T03:45:09.0349207Z Ignoring disabled issues: [] 2022-11-23T03:45:09.4793095Z Running distributed/fsdp/test_fsdp_use_orig_params.py ... [2022-11-23 03:45:09.478705] 2022-11-23T03:45:09.4794406Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_use_orig_params.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:45:09.479141] 2022-11-23T03:48:19.6584248Z 2022-11-23T03:48:19.6585176Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_use_orig_params 2022-11-23T03:48:19.6586206Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_use_orig_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_use_orig_params_ttwhn5ja) 2022-11-23T03:48:19.6592217Z 2022-11-23T03:48:19.6592484Z Running tests... 
2022-11-23T03:48:19.6593066Z ---------------------------------------------------------------------- 2022-11-23T03:48:19.6593685Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params 2022-11-23T03:48:19.6594183Z test_named_parameters_in_forward (__main__.TestFSDPUseOrigParamsFQNs) 2022-11-23T03:48:19.6594779Z Tests that calling ``named_parameters()`` during forward returns FQNs ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:48:19.6595321Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102864 2022-11-23T03:48:19.6595762Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102865 2022-11-23T03:48:19.6596205Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 102866 2022-11-23T03:48:19.6596652Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 102867 2022-11-23T03:48:19.6597261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6598047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6598700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6599192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6599754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6600205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6601187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6601697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6602263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6602720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6603296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6603744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6604324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6604775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6605355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6605814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6606277Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:48:19.6606782Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6607415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6608023Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:48:19.6608687Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:48:19.6609366Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:48:19.6610052Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:48:19.6610736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:48:19.6611259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6611720Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6612190Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:48:19.6612663Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:48:19.6613000Z dist init r=1, world=4 2022-11-23T03:48:19.6613260Z dist init r=2, world=4 2022-11-23T03:48:19.6613510Z dist init r=3, world=4 2022-11-23T03:48:19.6613765Z dist init r=0, world=4 2022-11-23T03:48:19.6613992Z ok (6.691s) 2022-11-23T03:48:19.6614332Z test_param_and_buffer_names (__main__.TestFSDPUseOrigParamsFQNs) 2022-11-23T03:48:19.6614989Z Tests that, for ``use_orig_params=True``, the parameter and buffer ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103149 2022-11-23T03:48:19.6615514Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103150 2022-11-23T03:48:19.6615966Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 103151 2022-11-23T03:48:19.6616420Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 103152 2022-11-23T03:48:19.6617043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6617478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6618057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6618596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6619167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6619610Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6620168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6620616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6621167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6621632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6622213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6622655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6623230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6623669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6624664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6625113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6625680Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6626308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:48:19.6626794Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:48:19.6627261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6627920Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:48:19.6628607Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:48:19.6629287Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:48:19.6629951Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:48:19.6630548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:48:19.6631020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6631466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6631937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:48:19.6633197Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6633983Z warnings.warn( 2022-11-23T03:48:19.6635205Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:48:19.6635983Z warnings.warn( 2022-11-23T03:48:19.6637129Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6637893Z warnings.warn( 2022-11-23T03:48:19.6639022Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6639763Z warnings.warn( 2022-11-23T03:48:19.6640013Z dist init r=3, world=4 2022-11-23T03:48:19.6640261Z dist init r=0, world=4 2022-11-23T03:48:19.6640488Z dist init r=1, world=4 2022-11-23T03:48:19.6640734Z dist init r=2, world=4 2022-11-23T03:48:19.6640966Z ok (4.317s) 2022-11-23T03:48:19.6641404Z test_diff_hyperparams_cpu_offload_sharding_strategy_str_full_shard (__main__.TestFSDPUseOrigParamsMultipleParamGroups) 2022-11-23T03:48:19.6642092Z Tests FSDP parity with DDP when using multiple parameter groups with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103434 2022-11-23T03:48:19.6642630Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103435 2022-11-23T03:48:19.6643240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6643675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6644250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6644717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6645291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6645718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6646280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6646740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6647192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6647672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6648327Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6649013Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.6649510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6649979Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6650453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6650933Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6651277Z dist init r=0, world=2 2022-11-23T03:48:19.6651526Z dist init r=1, world=2 2022-11-23T03:48:19.6651763Z ok (5.617s) 2022-11-23T03:48:19.6652240Z test_diff_hyperparams_cpu_offload_sharding_strategy_str_no_shard (__main__.TestFSDPUseOrigParamsMultipleParamGroups) 2022-11-23T03:48:19.6652882Z Tests FSDP parity with DDP when using multiple parameter groups with ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103585 2022-11-23T03:48:19.6653420Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103586 2022-11-23T03:48:19.6654029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6654468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6655038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6655504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6656062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6656508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6657076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6657536Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6657971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6658531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6659187Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6659866Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6660424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6660897Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6661371Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6661835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6662193Z dist init r=1, world=2 2022-11-23T03:48:19.6662444Z dist init r=0, world=2 2022-11-23T03:48:19.6662682Z ok (5.617s) 2022-11-23T03:48:19.6663106Z test_diff_hyperparams_cpu_offload_sharding_strategy_str_shard_grad_op (__main__.TestFSDPUseOrigParamsMultipleParamGroups) 2022-11-23T03:48:19.6663743Z Tests FSDP parity with DDP when using multiple parameter groups with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103736 2022-11-23T03:48:19.6664548Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103737 2022-11-23T03:48:19.6665153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6665598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6666166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6666629Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6667189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6667642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6668208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6668651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6669099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6669679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6670342Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6671009Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.6671528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6671993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6672550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6673011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6673372Z dist init r=0, world=2 2022-11-23T03:48:19.6673620Z dist init r=1, world=2 2022-11-23T03:48:19.6673839Z ok (5.617s) 2022-11-23T03:48:19.6674260Z test_diff_hyperparams_sharding_strategy_str_full_shard (__main__.TestFSDPUseOrigParamsMultipleParamGroups) 2022-11-23T03:48:19.6674882Z Tests FSDP parity with DDP when using multiple parameter groups with ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103887 2022-11-23T03:48:19.6675423Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103888 2022-11-23T03:48:19.6676094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6676537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6677094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6677523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6678098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6678562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6679144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6679589Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6680041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6680539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6681172Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6681851Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6682373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6682843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6683299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6683778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6684258Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6684730Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6685181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6685646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6686109Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6686613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6687097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6687568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6688037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6688496Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6688958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6689424Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6689870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6690337Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6690802Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6691265Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6691712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6692175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6692712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6693178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6693626Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6694090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6694553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6695004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6695470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6695931Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6696395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6696846Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6697303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6697766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6698211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6698679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6699141Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6699603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6700048Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6700514Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6700974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6701418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6701884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6702342Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6702858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6703315Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6703772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6704438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6704888Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6705351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6706620Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6707397Z warnings.warn( 2022-11-23T03:48:19.6708773Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6709688Z warnings.warn( 2022-11-23T03:48:19.6710039Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6710513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6710990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6711442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6711914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6712380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6712847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6713297Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6713763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6714223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6714758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6715233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6715693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6716154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6716599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6717064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6717524Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6717985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6718434Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6718892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6719436Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6719897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6720359Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6720818Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6721285Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6721733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6722192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6722653Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6723097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6723567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6724028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6724487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6724934Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6725458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6725918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6726361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6726825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6727290Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6727751Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6728193Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6728653Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6729118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6729565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6730026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6730487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6730946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6731397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6731858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6732206Z dist init r=0, world=2 2022-11-23T03:48:19.6732437Z dist init r=1, world=2 2022-11-23T03:48:19.6732678Z ok (35.478s) 2022-11-23T03:48:19.6733101Z test_diff_hyperparams_sharding_strategy_str_no_shard (__main__.TestFSDPUseOrigParamsMultipleParamGroups) 2022-11-23T03:48:19.6733723Z Tests FSDP parity with DDP when using multiple parameter groups with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104038 2022-11-23T03:48:19.6734243Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104039 2022-11-23T03:48:19.6734864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6735310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6735921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6736400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6736974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6737409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6737962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6738424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6738875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6739363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6740000Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6740686Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.6741199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6741646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6742181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6742656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6743134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6743591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6744806Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6746051Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6747269Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6748490Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6749702Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6750926Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6752185Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6753419Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6754635Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6755848Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6756564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6757043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6757588Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6758062Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6758527Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6758994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6759451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6759917Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6760380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6760829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6761295Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6761760Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6762221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6762669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6763131Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6763593Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6764041Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6764500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6764960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6765421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6765871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6766330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6766791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6767234Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6768281Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6769510Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6770734Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6771949Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6773161Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6774425Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6775640Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6776833Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6778050Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6779263Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6779983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6780465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6780924Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6781395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6781865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6782333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6782836Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6783310Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6783781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6784508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6785529Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6786762Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6787982Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6789299Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6790523Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6791725Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6792939Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6794142Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6795356Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6796574Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6797301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6797779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6798301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6798785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6799773Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6801006Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6802224Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6803442Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6804724Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6805914Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6807129Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6808339Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6809537Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6810765Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6811485Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6811968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6812420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6812902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6813430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6813898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6815293Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6816086Z warnings.warn( 2022-11-23T03:48:19.6817242Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6818006Z warnings.warn( 2022-11-23T03:48:19.6818374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6818897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6819466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6819938Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6820388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6820863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6821333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6821796Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6822248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6822713Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6823181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6823626Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6824308Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6824778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6825262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6825721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6826190Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6826659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6827668Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6828919Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:48:19.6830195Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6831426Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6832647Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6833867Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6835080Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6836391Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6837629Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6838835Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6839554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6840020Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6840497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6840974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6841445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6841900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6842372Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6842850Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6843305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6843802Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6844316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6844850Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6845359Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6845811Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6846285Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6846775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6847257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6847708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6848172Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6848649Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6849102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6849580Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6850054Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6850532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6851569Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T03:48:19.6852804Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6854052Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6855278Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6856503Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6857725Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6858954Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6860220Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6861439Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6862667Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:48:19.6863394Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6864158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6864671Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6865134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6865618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6866102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6866544Z dist init r=1, world=2 2022-11-23T03:48:19.6866807Z dist init r=0, world=2 2022-11-23T03:48:19.6867056Z ok (34.475s) 2022-11-23T03:48:19.6867474Z test_diff_hyperparams_sharding_strategy_str_shard_grad_op (__main__.TestFSDPUseOrigParamsMultipleParamGroups) 2022-11-23T03:48:19.6868113Z Tests FSDP parity with DDP when using multiple parameter groups with ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104189 2022-11-23T03:48:19.6868666Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104190 2022-11-23T03:48:19.6869301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6869743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6870330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6870918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6871516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6871944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6872520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6872996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6873439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6873945Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6874611Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.6875303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6875809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6876291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6876766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6877254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6877848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6878352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6878826Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6879285Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6879774Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6880259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6880734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6881191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6881665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6882142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6882613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6883064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6883525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6884132Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6884582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6885044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6885505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6885972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6886428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6886892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6887356Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6887806Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6888275Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6888737Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6889198Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6889644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6890107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6890656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6891101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6891567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6892038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6892499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6892944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6893403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6893864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6894362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6894840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6895303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6895765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6896221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6896679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6897143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6897601Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6898051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6898511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6898972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6900224Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6901061Z warnings.warn( 2022-11-23T03:48:19.6902219Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:48:19.6902981Z warnings.warn( 2022-11-23T03:48:19.6903351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6903810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6904501Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6904978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6905449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6905902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6906368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6906836Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6907283Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6907792Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6908269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6908743Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6909195Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6909659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6910119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6910646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6911124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6911587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6912055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6912509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6912971Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6913434Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6913893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6914349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6915029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6915504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6915954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6916421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6916978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6917442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6917891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6918357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6918827Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6919274Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6919738Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6920196Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6920658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6921106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6921566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6922026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.6922471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6922939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6923397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6923859Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6924305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6924769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6925230Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6925691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6926031Z dist init r=1, world=2 2022-11-23T03:48:19.6926280Z dist init r=0, world=2 2022-11-23T03:48:19.6926519Z ok (33.975s) 2022-11-23T03:48:19.6926878Z test_diff_trainability (__main__.TestFSDPUseOrigParamsMultipleParamGroups) 2022-11-23T03:48:19.6927508Z Tests FSDP parity with DDP when using multiple parameter groups and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104340 2022-11-23T03:48:19.6928061Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104341 2022-11-23T03:48:19.6928662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6929117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6929688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6930153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6930709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6931151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6931728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6932169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6932613Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6933101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6933810Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6934481Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.6935058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6935528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6935882Z dist init r=1, world=2 2022-11-23T03:48:19.6936116Z dist init r=0, world=2 2022-11-23T03:48:19.6936352Z ok (8.322s) 2022-11-23T03:48:19.6936729Z test_multiple_optimizers (__main__.TestFSDPUseOrigParamsMultipleParamGroups) 2022-11-23T03:48:19.6937286Z Tests using two optimizers where only one sets gradients to ``None``. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104491 2022-11-23T03:48:19.6937818Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104492 2022-11-23T03:48:19.6938423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6938872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6939428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6939894Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6940475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6940900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6941466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6941929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6942382Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6942853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store 
for rank: 0 2022-11-23T03:48:19.6943501Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6944533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6945064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6945516Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6946531Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T03:48:19.6948020Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T03:48:19.6948949Z dist init r=1, world=2 2022-11-23T03:48:19.6949196Z dist init r=0, world=2 2022-11-23T03:48:19.6949432Z ok (5.316s) 2022-11-23T03:48:19.6949779Z test_access_params_after_forward (__main__.TestFSDPUseOrigParamsParamAccess) 2022-11-23T03:48:19.6950344Z Tests that accessing the original parameters after the forward but ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104642 2022-11-23T03:48:19.6950878Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104643 2022-11-23T03:48:19.6951503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6951938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6952512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6952978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6953539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6953983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6954550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6955010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6955446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6955946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6956596Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.6957258Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6957781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6958249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6958720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6959182Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6959657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6960244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6960732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6961188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.6961545Z dist init r=0, world=2 2022-11-23T03:48:19.6961806Z dist init r=1, world=2 2022-11-23T03:48:19.6962024Z ok (4.614s) 2022-11-23T03:48:19.6962406Z test_multiple_forward_offload_params_False (__main__.TestFSDPUseOrigParamsUnshardReshard) 2022-11-23T03:48:19.6962981Z Tests that ``use_orig_params=True`` has parity with ``False`` when ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104793 2022-11-23T03:48:19.6963491Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104794 2022-11-23T03:48:19.6964114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6964564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6965133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6965579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6966154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6966661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6967237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6967685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6968139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6968636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6969268Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6969950Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.6970473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6970940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6971274Z dist init r=0, world=2 2022-11-23T03:48:19.6971524Z dist init r=1, world=2 2022-11-23T03:48:19.6971763Z ok (5.917s) 2022-11-23T03:48:19.6972127Z test_multiple_forward_offload_params_True (__main__.TestFSDPUseOrigParamsUnshardReshard) 2022-11-23T03:48:19.6972708Z Tests that ``use_orig_params=True`` has parity with ``False`` when ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104944 2022-11-23T03:48:19.6973234Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104945 2022-11-23T03:48:19.6973844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6974278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6974851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6975314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6975867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6976306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6976926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6977395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6977825Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6978316Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 0 2022-11-23T03:48:19.6978968Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6979658Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6980163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6980631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6980981Z dist init r=0, world=2 2022-11-23T03:48:19.6981215Z dist init r=1, world=2 2022-11-23T03:48:19.6981454Z ok (6.018s) 2022-11-23T03:48:19.6981857Z test_summon_between_two_forwards_offload_params_False (__main__.TestFSDPUseOrigParamsUnshardReshard) 2022-11-23T03:48:19.6982438Z Tests that ``use_orig_params=True`` has parity with ``False`` when ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105095 2022-11-23T03:48:19.6982945Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105096 2022-11-23T03:48:19.6983619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6984288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6984852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6985321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6985898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6986344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6986891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 
disabled tests 2022-11-23T03:48:19.6987349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6987804Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6988280Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6988925Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6989602Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6990121Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.6990572Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.6990916Z dist init r=0, world=2 2022-11-23T03:48:19.6991166Z dist init r=1, world=2 2022-11-23T03:48:19.6991385Z ok (6.418s) 2022-11-23T03:48:19.6991792Z test_summon_between_two_forwards_offload_params_True (__main__.TestFSDPUseOrigParamsUnshardReshard) 2022-11-23T03:48:19.6992376Z Tests that ``use_orig_params=True`` has parity with ``False`` when ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105246 2022-11-23T03:48:19.6992894Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105247 2022-11-23T03:48:19.6993485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6993935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6994587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6995070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6995632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.6996079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.6996647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.6997088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.6997540Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.6998034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.6998690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.6999360Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.6999881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.7000422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.7000778Z dist init r=0, world=2 2022-11-23T03:48:19.7001012Z dist init r=1, world=2 2022-11-23T03:48:19.7001247Z ok (6.419s) 2022-11-23T03:48:19.7001578Z test_grad_writeback (__main__.TestFSDPUseOrigParamsWriteback) 2022-11-23T03:48:19.7002241Z Tests that changes to the original parameters' gradients are written ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105397 2022-11-23T03:48:19.7002782Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105398 2022-11-23T03:48:19.7003394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.7003826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.7004400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.7004872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.7005446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.7005870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.7006434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.7006896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.7007347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.7007822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T03:48:19.7008467Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.7009155Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.7009659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.7010124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.7010600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7011079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7011595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7012078Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7012551Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7013002Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7013475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7013938Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7014402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7014940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7015420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T03:48:19.7015890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T03:48:19.7016247Z dist init r=0, world=2 2022-11-23T03:48:19.7016479Z dist init r=1, world=2 2022-11-23T03:48:19.7016713Z ok (4.714s) 2022-11-23T03:48:19.7017045Z test_param_writeback (__main__.TestFSDPUseOrigParamsWriteback) 2022-11-23T03:48:19.7017563Z Tests that changes to the original parameters are written back. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105548 2022-11-23T03:48:19.7018155Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105549 2022-11-23T03:48:19.7018772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.7019206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.7019781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.7020248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.7020821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.7021245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.7021811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.7022274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.7022778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.7023274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.7024176Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:48:19.7024887Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.7025381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.7025932Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.7026285Z dist init r=1, world=2 2022-11-23T03:48:19.7026536Z dist init r=0, world=2 2022-11-23T03:48:19.7026762Z ok (4.113s) 2022-11-23T03:48:19.7027230Z test_writeback_shape_mismatch (__main__.TestFSDPUseOrigParamsWriteback) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105691 2022-11-23T03:48:19.7027804Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105692 2022-11-23T03:48:19.7028395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.7029021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.7029618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.7030090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.7030654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:48:19.7031106Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:48:19.7031672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:48:19.7032124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:48:19.7032700Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:48:19.7033197Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:48:19.7033855Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.7034521Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:48:19.7035042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:48:19.7035609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:48:19.7035962Z dist init r=1, world=2 2022-11-23T03:48:19.7036195Z dist init r=0, world=2 2022-11-23T03:48:19.7036439Z ok (4.114s) 2022-11-23T03:48:19.7036592Z 2022-11-23T03:48:19.7036872Z ---------------------------------------------------------------------- 2022-11-23T03:48:19.7037191Z Ran 18 tests in 187.755s 2022-11-23T03:48:19.7037354Z 2022-11-23T03:48:19.7037447Z OK 2022-11-23T03:48:19.7037581Z 2022-11-23T03:48:19.7037707Z Generating XML reports... 
2022-11-23T03:48:19.7038347Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsFQNs-20221123034511.xml 2022-11-23T03:48:19.7039286Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsMultipleParamGroups-20221123034511.xml 2022-11-23T03:48:19.7040223Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsParamAccess-20221123034511.xml 2022-11-23T03:48:19.7041139Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsUnshardReshard-20221123034511.xml 2022-11-23T03:48:19.7042046Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsWriteback-20221123034511.xml 2022-11-23T03:48:19.7042451Z 2022-11-23T03:48:19.7042883Z ##[endgroup] 2022-11-23T03:48:19.7043525Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_use_orig_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_use_orig_params_ttwhn5ja) 2022-11-23T03:48:19.7043900Z 2022-11-23T03:48:20.0507416Z 2022-11-23T03:48:20.0507972Z real 3m15.750s 2022-11-23T03:48:20.0508272Z user 6m32.507s 2022-11-23T03:48:20.0508507Z sys 2m34.516s 2022-11-23T03:48:20.0508800Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:48:20.0509390Z + python test/run_test.py --verbose -i distributed/fsdp/test_shard_utils.py 2022-11-23T03:48:22.4479620Z Ignoring disabled issues: [] 2022-11-23T03:48:22.5020923Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T03:48:22.5021507Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:48:22.5021842Z Selected tests: 2022-11-23T03:48:22.5022124Z distributed/fsdp/test_shard_utils.py 2022-11-23T03:48:22.5050114Z Prioritized test from test file changes. 2022-11-23T03:48:22.5050471Z reordering tests for PR: 2022-11-23T03:48:22.5050745Z prioritized: [] 2022-11-23T03:48:22.5051232Z the rest: ['distributed/fsdp/test_shard_utils.py'] 2022-11-23T03:48:22.5051448Z 2022-11-23T03:48:22.5051993Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:48:22.5052994Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:48:22.5059293Z parallel (file granularity) tests: 2022-11-23T03:48:22.5059597Z 2022-11-23T03:48:22.5059840Z serial (file granularity) tests: 2022-11-23T03:48:22.5060130Z distributed/fsdp/test_shard_utils.py 2022-11-23T03:48:24.8318438Z Ignoring disabled issues: [] 2022-11-23T03:48:25.2561079Z Running distributed/fsdp/test_shard_utils.py ... [2022-11-23 03:48:25.255402] 2022-11-23T03:48:25.2562739Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_shard_utils.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 03:48:25.255843] 2022-11-23T03:48:27.5220970Z 2022-11-23T03:48:27.5221697Z Expand the folded group to see the log file of distributed/fsdp/test_shard_utils 2022-11-23T03:48:27.5222760Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_shard_utils (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_shard_utils_q7t6oseg) 2022-11-23T03:48:27.5223466Z 2022-11-23T03:48:27.5223787Z ##[endgroup] 2022-11-23T03:48:27.5225294Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_shard_utils (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_shard_utils_q7t6oseg) 2022-11-23T03:48:27.5225668Z 2022-11-23T03:48:27.8903659Z 2022-11-23T03:48:27.8904689Z real 0m7.840s 2022-11-23T03:48:27.8905053Z user 0m13.756s 2022-11-23T03:48:27.8905332Z sys 0m13.956s 2022-11-23T03:48:27.8905617Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:48:27.8906239Z + python test/run_test.py --verbose -i distributed/fsdp/test_utils.py 2022-11-23T03:48:30.2737881Z Ignoring disabled issues: [] 2022-11-23T03:48:30.3284674Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:48:30.3285282Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:48:30.3285655Z Selected tests: 2022-11-23T03:48:30.3285947Z distributed/fsdp/test_utils.py 2022-11-23T03:48:30.3311138Z Prioritized test from test file changes. 
2022-11-23T03:48:30.3311772Z reordering tests for PR: 2022-11-23T03:48:30.3312101Z prioritized: [] 2022-11-23T03:48:30.3312597Z the rest: ['distributed/fsdp/test_utils.py'] 2022-11-23T03:48:30.3312809Z 2022-11-23T03:48:30.3313356Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:48:30.3314314Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:48:30.3319943Z parallel (file granularity) tests: 2022-11-23T03:48:30.3320220Z 2022-11-23T03:48:30.3320476Z serial (file granularity) tests: 2022-11-23T03:48:30.3320789Z distributed/fsdp/test_utils.py 2022-11-23T03:48:32.6345258Z Ignoring disabled issues: [] 2022-11-23T03:48:33.0398236Z Running distributed/fsdp/test_utils.py ... [2022-11-23 03:48:33.039169] 2022-11-23T03:48:33.0399745Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_utils.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:48:33.039623] 2022-11-23T03:48:37.2474313Z 2022-11-23T03:48:37.2475021Z Expand the folded group to see the log file of distributed/fsdp/test_utils 2022-11-23T03:48:37.2476234Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_utils (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_utils_kyxdsod0) 2022-11-23T03:48:37.2476602Z 2022-11-23T03:48:37.2476715Z Running tests... 2022-11-23T03:48:37.2477254Z ---------------------------------------------------------------------- 2022-11-23T03:48:37.2477787Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_utils 2022-11-23T03:48:37.2478239Z test_module_wrap_policy (__main__.TestGetSubmoduleToStates) 2022-11-23T03:48:37.2478663Z Tests the module wrap policy on a nested model with buffers and a ... 
ok (1.762s) 2022-11-23T03:48:37.2479049Z test_apply_to_tensors_cpu_cuda (__main__.TestUtils) ... ok (0.004s) 2022-11-23T03:48:37.2479523Z test_apply_to_tensors_devices_['cpu'] (__main__.TestUtils) ... ok (0.003s) 2022-11-23T03:48:37.2480028Z test_apply_to_tensors_devices_['cuda'] (__main__.TestUtils) ... ok (0.004s) 2022-11-23T03:48:37.2480389Z test_packed_sequence (__main__.TestUtils) 2022-11-23T03:48:37.2480744Z Test to ensure RNN packed sequences are modified correctly. ... ok (0.003s) 2022-11-23T03:48:37.2481129Z test_replace_by_prefix (__main__.TestUtils) ... ok (0.001s) 2022-11-23T03:48:37.2481337Z 2022-11-23T03:48:37.2481607Z ---------------------------------------------------------------------- 2022-11-23T03:48:37.2481922Z Ran 6 tests in 1.778s 2022-11-23T03:48:37.2482083Z 2022-11-23T03:48:37.2482175Z OK 2022-11-23T03:48:37.2482309Z 2022-11-23T03:48:37.2482548Z Generating XML reports... 2022-11-23T03:48:37.2483175Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_utils/TEST-TestGetSubmoduleToStates-20221123034835.xml 2022-11-23T03:48:37.2483968Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_utils/TEST-TestUtils-20221123034835.xml 2022-11-23T03:48:37.2484289Z 2022-11-23T03:48:37.2484595Z ##[endgroup] 2022-11-23T03:48:37.2485166Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_utils (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_utils_kyxdsod0) 2022-11-23T03:48:37.2485499Z 2022-11-23T03:48:37.6470015Z 2022-11-23T03:48:37.6470649Z real 0m9.757s 2022-11-23T03:48:37.6471033Z user 0m15.633s 2022-11-23T03:48:37.6471292Z sys 0m14.651s 2022-11-23T03:48:37.6471596Z + for f in test/distributed/fsdp/*.py 2022-11-23T03:48:37.6472185Z + python test/run_test.py --verbose -i distributed/fsdp/test_wrap.py 2022-11-23T03:48:40.0421924Z Ignoring disabled issues: [] 2022-11-23T03:48:40.0973921Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are 
deprecated. Use packaging.version instead. 2022-11-23T03:48:40.0974495Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:48:40.0974856Z Selected tests: 2022-11-23T03:48:40.0975139Z distributed/fsdp/test_wrap.py 2022-11-23T03:48:40.1004764Z Prioritized test from test file changes. 2022-11-23T03:48:40.1005102Z reordering tests for PR: 2022-11-23T03:48:40.1005367Z prioritized: [] 2022-11-23T03:48:40.1005828Z the rest: ['distributed/fsdp/test_wrap.py'] 2022-11-23T03:48:40.1006048Z 2022-11-23T03:48:40.1006594Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:48:40.1007533Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:48:40.1014012Z parallel (file granularity) tests: 2022-11-23T03:48:40.1014316Z 2022-11-23T03:48:40.1014565Z serial (file granularity) tests: 2022-11-23T03:48:40.1014860Z distributed/fsdp/test_wrap.py 2022-11-23T03:48:42.4108197Z Ignoring disabled issues: [] 2022-11-23T03:48:42.8144514Z Running distributed/fsdp/test_wrap.py ... [2022-11-23 03:48:42.813715] 2022-11-23T03:48:42.8145542Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_wrap.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:48:42.814213] 2022-11-23T03:50:34.9580997Z 2022-11-23T03:50:34.9585702Z Expand the folded group to see the log file of distributed/fsdp/test_wrap 2022-11-23T03:50:34.9586918Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_wrap (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_wrap_smb40d5d) 2022-11-23T03:50:34.9600966Z 2022-11-23T03:50:34.9601285Z Running tests... 
2022-11-23T03:50:34.9603214Z ---------------------------------------------------------------------- 2022-11-23T03:50:34.9603829Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_wrap 2022-11-23T03:50:34.9604245Z test_always_wrap (__main__.TestAutoWrap) 2022-11-23T03:50:34.9604655Z Test to ensure that if `always_wrap_policy` is ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:50:34.9604988Z ok (1.717s) 2022-11-23T03:50:34.9606424Z test_always_wrap_with_ignored_modules_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9607351Z warnings.warn( 2022-11-23T03:50:34.9607788Z ok (0.005s) 2022-11-23T03:50:34.9608173Z test_always_wrap_with_ignored_modules_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... ok (0.005s) 2022-11-23T03:50:34.9608573Z test_auto_wrap_api (__main__.TestAutoWrap) 2022-11-23T03:50:34.9609000Z Test to ensure with auto wrap, we wrap child modules correctly based on the min_num_params. ... ok (0.003s) 2022-11-23T03:50:34.9609434Z test_auto_wrap_preset_exclude_wrap (__main__.TestAutoWrap) 2022-11-23T03:50:34.9609876Z Test to ensure excluded modules are not wrapped, regardless if the total param size is greater than the ... ok (0.002s) 2022-11-23T03:50:34.9610357Z test_auto_wrap_preset_exclude_wrap_include_children (__main__.TestAutoWrap) 2022-11-23T03:50:34.9610836Z Test to ensure excluded modules are not wrapped, but children are if param size is greater than ... 
ok (0.002s) 2022-11-23T03:50:34.9611297Z test_auto_wrap_preset_force_leaf (__main__.TestAutoWrap) 2022-11-23T03:50:34.9611821Z Test to ensure force-leaf modules are not wrapped, and children are not wrapped. The ... ok (0.004s) 2022-11-23T03:50:34.9612290Z test_auto_wrap_preset_force_leaf_custom (__main__.TestAutoWrap) 2022-11-23T03:50:34.9612753Z Test to ensure force-leaf modules are not wrapped. ... ok (0.002s) 2022-11-23T03:50:34.9613420Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=False)_use_device_id_False (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9614277Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:50:34.9614656Z ok (0.490s) 2022-11-23T03:50:34.9615226Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=False)_use_device_id_True (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9616069Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:50:34.9616458Z ok (0.046s) 2022-11-23T03:50:34.9616886Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=True)_use_device_id_False (__main__.TestAutoWrap) ... ok (0.002s) 2022-11-23T03:50:34.9617526Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=True)_use_device_id_True (__main__.TestAutoWrap) ... ok (0.002s) 2022-11-23T03:50:34.9618370Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=False)_use_device_id_False (__main__.TestAutoWrap) ... 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9619237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:50:34.9620267Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:50:34.9621505Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:50:34.9622100Z ok (0.065s) 2022-11-23T03:50:34.9622693Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=False)_use_device_id_True (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9623610Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:50:34.9624434Z ok (0.044s) 2022-11-23T03:50:34.9625075Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=True)_use_device_id_False (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9625928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-11-23T03:50:34.9626326Z ok (0.045s) 2022-11-23T03:50:34.9626873Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=True)_use_device_id_True (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9627709Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T03:50:34.9628763Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:50:34.9629983Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:50:34.9631204Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:50:34.9632415Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T03:50:34.9633008Z ok (0.079s) 2022-11-23T03:50:34.9633377Z test_auto_wrap_with_ignored_modules_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... 
ok (0.003s) 2022-11-23T03:50:34.9633967Z test_auto_wrap_with_ignored_modules_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... ok (0.003s) 2022-11-23T03:50:34.9634400Z test_module_wrap_policy (__main__.TestAutoWrap) 2022-11-23T03:50:34.9634728Z Tests the ``ModuleWrapPolicy``. ... ok (0.027s) 2022-11-23T03:50:34.9635094Z test_transformer_auto_wrap_policy (__main__.TestAutoWrap) 2022-11-23T03:50:34.9635461Z Tests the ``transformer_auto_wrap_policy``. ... ok (0.020s) 2022-11-23T03:50:34.9636223Z test_wrap_disabled_outside_context (__main__.TestAutoWrap) ... ok (0.002s) 2022-11-23T03:50:34.9636612Z test_wrap_override_defaults (__main__.TestAutoWrap) ... ok (0.002s) 2022-11-23T03:50:34.9637026Z test_wrap_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... ok (0.002s) 2022-11-23T03:50:34.9637464Z test_wrap_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... ok (0.002s) 2022-11-23T03:50:34.9637852Z test_bn_always_wrapped_individually (__main__.TestFSDPWrap) 2022-11-23T03:50:34.9638385Z Ensures that by using _or_policy with _wrap_batchnorm_individually, even ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106501 2022-11-23T03:50:34.9638933Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106502 2022-11-23T03:50:34.9639382Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 106503 2022-11-23T03:50:34.9639893Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 106504 2022-11-23T03:50:34.9640510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9640966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9641519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9642002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9642588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9643049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9643604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9644077Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9644662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9645105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9645652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9646116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9646698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9647123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9647695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9648156Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9648621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9649104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9649592Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9650080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9650786Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9651515Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9652201Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9652882Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9653380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9653856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9654314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9654778Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9656032Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9656867Z warnings.warn( 2022-11-23T03:50:34.9658025Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9658784Z warnings.warn( 2022-11-23T03:50:34.9659931Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9660700Z warnings.warn( 2022-11-23T03:50:34.9661824Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9662588Z warnings.warn( 2022-11-23T03:50:34.9662836Z dist init r=3, world=4 2022-11-23T03:50:34.9663087Z dist init r=0, world=4 2022-11-23T03:50:34.9663316Z dist init r=1, world=4 2022-11-23T03:50:34.9663568Z dist init r=2, world=4 2022-11-23T03:50:34.9663804Z ok (4.520s) 2022-11-23T03:50:34.9664678Z test_error_already_wrapped_nested_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) 2022-11-23T03:50:34.9665271Z Test that an error is raised if we attempt to wrap when submodules are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106786 2022-11-23T03:50:34.9665807Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106787 2022-11-23T03:50:34.9666254Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 106788 2022-11-23T03:50:34.9666679Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 106789 2022-11-23T03:50:34.9667394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9667855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9668415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9668889Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9669472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9669921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9670473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9670944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9671518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9671959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9672507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9672970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9673536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9674036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9674679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9675144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9675593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9676074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9676565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9677061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9677689Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9678376Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9679059Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9679736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9680239Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9680709Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9681168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9681699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9682947Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9683735Z warnings.warn( 2022-11-23T03:50:34.9684947Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9685722Z warnings.warn( 2022-11-23T03:50:34.9686867Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9687637Z warnings.warn( 2022-11-23T03:50:34.9688818Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9689722Z warnings.warn( 2022-11-23T03:50:34.9689981Z dist init r=3, world=4 2022-11-23T03:50:34.9690317Z dist init r=0, world=4 2022-11-23T03:50:34.9690557Z dist init r=2, world=4 2022-11-23T03:50:34.9690827Z dist init r=1, world=4 2022-11-23T03:50:34.9691093Z ok (4.317s) 2022-11-23T03:50:34.9691477Z test_error_already_wrapped_nested_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) 2022-11-23T03:50:34.9692094Z Test that an error is raised if we attempt to wrap when submodules are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107071 2022-11-23T03:50:34.9692664Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107072 2022-11-23T03:50:34.9693145Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 107073 2022-11-23T03:50:34.9693608Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 107074 2022-11-23T03:50:34.9694261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9694736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9695320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9695821Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9696433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9696910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9697497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9697992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9698601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9699078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9699662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9700161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9700773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9701299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9701914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9702404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9702879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9703385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9704206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9704725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9705418Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9706112Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9706796Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9707473Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9708070Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9708546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9709011Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9709495Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9709840Z dist init r=3, world=4 2022-11-23T03:50:34.9710099Z dist init r=0, world=4 2022-11-23T03:50:34.9710353Z dist init r=2, world=4 2022-11-23T03:50:34.9710581Z dist init r=1, world=4 2022-11-23T03:50:34.9710823Z ok (4.317s) 2022-11-23T03:50:34.9711199Z test_error_already_wrapped_nested_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) 2022-11-23T03:50:34.9711758Z Test that an error is raised if we attempt to wrap when submodules are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107356 2022-11-23T03:50:34.9712299Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107357 2022-11-23T03:50:34.9712746Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 107358 2022-11-23T03:50:34.9713193Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 107359 2022-11-23T03:50:34.9713788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9714236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9714793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9715248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9715806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T03:50:34.9716257Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9716817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9717264Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9717841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9718295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9718943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9719398Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9719976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9720415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9720965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9721430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9721879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9722371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9722841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9723333Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9723992Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9724744Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9725481Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9726173Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9726703Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9727190Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9727644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9728110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9729378Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9730170Z warnings.warn( 2022-11-23T03:50:34.9731302Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:50:34.9732076Z warnings.warn( 2022-11-23T03:50:34.9733234Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9733998Z warnings.warn( 2022-11-23T03:50:34.9735193Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9735958Z warnings.warn( 2022-11-23T03:50:34.9736188Z dist init r=0, world=4 2022-11-23T03:50:34.9736444Z dist init r=2, world=4 2022-11-23T03:50:34.9736690Z dist init r=1, world=4 2022-11-23T03:50:34.9736918Z dist init r=3, world=4 2022-11-23T03:50:34.9737154Z ok (4.418s) 2022-11-23T03:50:34.9737536Z test_error_already_wrapped_nested_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) 2022-11-23T03:50:34.9738090Z Test that an error is raised if we attempt to wrap when submodules are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107641 2022-11-23T03:50:34.9738628Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107642 2022-11-23T03:50:34.9739080Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 107643 2022-11-23T03:50:34.9739527Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 107644 2022-11-23T03:50:34.9740116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9740566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9741200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9741647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9742220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9742662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9743233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9743681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9744529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9744970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9745546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9745989Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9746559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9747003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9747549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9748008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9748457Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9748951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9749422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9749913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9750564Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9751254Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9751994Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9752688Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9753206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9753659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9754125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9754581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9754939Z dist init r=3, world=4 2022-11-23T03:50:34.9755175Z dist init r=0, world=4 2022-11-23T03:50:34.9755425Z dist init r=1, world=4 2022-11-23T03:50:34.9755673Z dist init r=2, world=4 2022-11-23T03:50:34.9755891Z ok (4.418s) 2022-11-23T03:50:34.9756526Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107926 2022-11-23T03:50:34.9757240Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107927 2022-11-23T03:50:34.9757687Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 107928 2022-11-23T03:50:34.9758185Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 107929 2022-11-23T03:50:34.9758797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9759249Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9759804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9760275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9760850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:50:34.9761295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9761843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9762312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9762889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9763331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9763879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9764340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9764914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9765334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9765902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9766364Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9766813Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9767290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9767777Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9768260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9768940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9769635Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9770315Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9770999Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9771493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9771963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9772421Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9772888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9774125Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9774952Z warnings.warn( 2022-11-23T03:50:34.9776090Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:50:34.9776862Z warnings.warn( 2022-11-23T03:50:34.9778006Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9778764Z warnings.warn( 2022-11-23T03:50:34.9779889Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9780648Z warnings.warn( 2022-11-23T03:50:34.9780896Z dist init r=3, world=4 2022-11-23T03:50:34.9781148Z dist init r=0, world=4 2022-11-23T03:50:34.9781375Z dist init r=1, world=4 2022-11-23T03:50:34.9781621Z dist init r=2, world=4 2022-11-23T03:50:34.9781857Z ok (4.921s) 2022-11-23T03:50:34.9782474Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108227 2022-11-23T03:50:34.9783184Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108228 2022-11-23T03:50:34.9783634Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 108229 2022-11-23T03:50:34.9784282Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 108230 2022-11-23T03:50:34.9784961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9785423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9785996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9786469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9787023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9787468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9788035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9788479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9789056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9789501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9790064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9790506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9791156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9791596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9792140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9792598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9793055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9793551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9794018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9794501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9795152Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9795832Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9796495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9797172Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9797693Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9798166Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9798611Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9799077Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9799431Z dist init r=3, world=4 2022-11-23T03:50:34.9799665Z dist init r=1, world=4 2022-11-23T03:50:34.9799914Z dist init r=0, world=4 2022-11-23T03:50:34.9800197Z dist init r=2, world=4 2022-11-23T03:50:34.9800414Z ok (4.921s) 2022-11-23T03:50:34.9801090Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108528 2022-11-23T03:50:34.9801799Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108529 2022-11-23T03:50:34.9802249Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 108530 2022-11-23T03:50:34.9802672Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 108531 2022-11-23T03:50:34.9803281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9803734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9804287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9804757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9805386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:50:34.9805835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9806387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9806851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9807419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9807988Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9808539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9809000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9809568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9809995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9810565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9811025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9811477Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9811958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9812445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9812935Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9813586Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9814261Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9814936Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9815609Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9816126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9816580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9817037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9817500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9818811Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9819593Z warnings.warn( 2022-11-23T03:50:34.9820744Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:50:34.9821511Z warnings.warn( 2022-11-23T03:50:34.9822644Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9823475Z warnings.warn( 2022-11-23T03:50:34.9824971Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9825730Z warnings.warn( 2022-11-23T03:50:34.9825979Z dist init r=0, world=4 2022-11-23T03:50:34.9826233Z dist init r=1, world=4 2022-11-23T03:50:34.9826462Z dist init r=2, world=4 2022-11-23T03:50:34.9826706Z dist init r=3, world=4 2022-11-23T03:50:34.9826943Z ok (4.921s) 2022-11-23T03:50:34.9827555Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108829 2022-11-23T03:50:34.9828274Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108830 2022-11-23T03:50:34.9828721Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 108831 2022-11-23T03:50:34.9829166Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 108832 2022-11-23T03:50:34.9829761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9830214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9830791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9831263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9831820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9832268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9832843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9833290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9833870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9834315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9834964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9835418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9835994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9836439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9837005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9837446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9837900Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9838394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9838917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9839487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9840138Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9840906Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9841569Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9842255Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9842771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9843248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9843694Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9844150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9844503Z dist init r=3, world=4 2022-11-23T03:50:34.9844736Z dist init r=1, world=4 2022-11-23T03:50:34.9844989Z dist init r=2, world=4 2022-11-23T03:50:34.9845236Z dist init r=0, world=4 2022-11-23T03:50:34.9845453Z ok (4.920s) 2022-11-23T03:50:34.9846080Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109130 2022-11-23T03:50:34.9846792Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109131 2022-11-23T03:50:34.9847244Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109132 2022-11-23T03:50:34.9847667Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109133 2022-11-23T03:50:34.9848281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9848731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9849309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9849757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9850333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:50:34.9850778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9851379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9851854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9852431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9852873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9853411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9853860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9854426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9854873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9855451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9855915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9856369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9856842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9857332Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9857917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9858569Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9859230Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9859913Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9860590Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9861107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9861560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9862023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9862485Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9863752Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9864775Z warnings.warn( 2022-11-23T03:50:34.9865937Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:50:34.9866703Z warnings.warn( 2022-11-23T03:50:34.9867922Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9868696Z warnings.warn( 2022-11-23T03:50:34.9869850Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9870585Z warnings.warn( 2022-11-23T03:50:34.9870833Z dist init r=0, world=4 2022-11-23T03:50:34.9871087Z dist init r=2, world=4 2022-11-23T03:50:34.9871314Z dist init r=3, world=4 2022-11-23T03:50:34.9871561Z dist init r=1, world=4 2022-11-23T03:50:34.9871794Z ok (4.920s) 2022-11-23T03:50:34.9872406Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109431 2022-11-23T03:50:34.9873122Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109432 2022-11-23T03:50:34.9873646Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109433 2022-11-23T03:50:34.9874092Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109434 2022-11-23T03:50:34.9874686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9875137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9875708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9876176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9876743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9877166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9877733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9878193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9878758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9879178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9879738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9880197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9880744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9881179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9881742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9882199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9882626Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9883116Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9883598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9884120Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9884782Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9885462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9886132Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9886788Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9887296Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9887762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9888223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9888670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9889015Z dist init r=3, world=4 2022-11-23T03:50:34.9889265Z dist init r=0, world=4 2022-11-23T03:50:34.9889494Z dist init r=1, world=4 2022-11-23T03:50:34.9889737Z dist init r=2, world=4 2022-11-23T03:50:34.9889970Z ok (4.921s) 2022-11-23T03:50:34.9890577Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109732 2022-11-23T03:50:34.9891345Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109733 2022-11-23T03:50:34.9891789Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109734 2022-11-23T03:50:34.9892227Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109735 2022-11-23T03:50:34.9892822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9893272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9893843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9894312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9894875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:50:34.9895319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9895883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9896326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9896902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9897339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9897902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9898340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9898912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9899347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9899910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9900349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9900905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9901446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9901925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9902405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9903054Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9903733Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9904659Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9905372Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9905890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9906359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9906807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9907260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9908611Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9909370Z warnings.warn( 2022-11-23T03:50:34.9910520Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:50:34.9911271Z warnings.warn( 2022-11-23T03:50:34.9912414Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9913170Z warnings.warn( 2022-11-23T03:50:34.9914307Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:34.9915066Z warnings.warn( 2022-11-23T03:50:34.9915295Z dist init r=3, world=4 2022-11-23T03:50:34.9915540Z dist init r=0, world=4 2022-11-23T03:50:34.9915786Z dist init r=2, world=4 2022-11-23T03:50:34.9916011Z dist init r=1, world=4 2022-11-23T03:50:34.9916243Z ok (4.921s) 2022-11-23T03:50:34.9916927Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110033 2022-11-23T03:50:34.9917643Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110034 2022-11-23T03:50:34.9918068Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110035 2022-11-23T03:50:34.9918505Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110036 2022-11-23T03:50:34.9919198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9919643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9920195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9920658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9921234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9921659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9922222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9922685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9923316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9923733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9924295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9924806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9925363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9925803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9926367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9926822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9927251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9927744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9928224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9928710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9929344Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9930029Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9930699Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9931348Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9931860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9932324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9932782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9933225Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9933567Z dist init r=3, world=4 2022-11-23T03:50:34.9933817Z dist init r=0, world=4 2022-11-23T03:50:34.9934112Z dist init r=1, world=4 2022-11-23T03:50:34.9934382Z dist init r=2, world=4 2022-11-23T03:50:34.9934629Z ok (4.920s) 2022-11-23T03:50:34.9935294Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110334 2022-11-23T03:50:34.9936045Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110335 2022-11-23T03:50:34.9936512Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110336 2022-11-23T03:50:34.9936982Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110337 2022-11-23T03:50:34.9937604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9938076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9938677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9939167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9939759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:50:34.9940295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9940899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9941389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9941977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9942450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9943048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9943522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9944392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9944842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9945421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9945869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9946319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9946806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9947281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9947761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9948406Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9949087Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9949754Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9950424Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9950943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9951491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9951946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9952400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9952751Z dist init r=3, world=4 2022-11-23T03:50:34.9952983Z dist init r=0, world=4 2022-11-23T03:50:34.9953229Z dist init r=2, world=4 2022-11-23T03:50:34.9953475Z dist init r=1, world=4 2022-11-23T03:50:34.9953690Z ok (4.320s) 2022-11-23T03:50:34.9954320Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110619 2022-11-23T03:50:34.9955030Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110620 2022-11-23T03:50:34.9955475Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110621 2022-11-23T03:50:34.9955895Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110622 2022-11-23T03:50:34.9956504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9956952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9957521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9958049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9958625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9959064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9959629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9960074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9960640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9961078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9961623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9962086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9962653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9963088Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9963632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9964092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9964660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9965133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9965617Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9966099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9966746Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9967410Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9968088Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9968817Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9969334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9969781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9970229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9970696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9971029Z dist init r=1, world=4 2022-11-23T03:50:34.9971277Z dist init r=2, world=4 2022-11-23T03:50:34.9971522Z dist init r=3, world=4 2022-11-23T03:50:34.9971765Z dist init r=0, world=4 2022-11-23T03:50:34.9971981Z ok (4.920s) 2022-11-23T03:50:34.9972607Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110920 2022-11-23T03:50:34.9973312Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110921 2022-11-23T03:50:34.9973742Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110922 2022-11-23T03:50:34.9974180Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110923 2022-11-23T03:50:34.9974916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9975360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9975913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9976375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9976950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:50:34.9977372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9977939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9978393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9978967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9979388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9979957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9980414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9980986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9981404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9981964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9982418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9982852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:34.9983340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:34.9983822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:34.9984562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:34.9985279Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:34.9985971Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9986644Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9987305Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:34.9987800Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:34.9988263Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:34.9988720Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:34.9989160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:34.9989643Z dist init r=3, world=4 2022-11-23T03:50:34.9989897Z dist init r=0, world=4 2022-11-23T03:50:34.9990140Z dist init r=1, world=4 2022-11-23T03:50:34.9990366Z dist init r=2, world=4 2022-11-23T03:50:34.9990598Z ok (4.419s) 2022-11-23T03:50:34.9991220Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111205 2022-11-23T03:50:34.9991985Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111206 2022-11-23T03:50:34.9992426Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111207 2022-11-23T03:50:34.9992862Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111208 2022-11-23T03:50:34.9993470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9993905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9994475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9994937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9995492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9995934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9996495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9996953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9997503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:34.9997949Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:34.9998512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:34.9998967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:34.9999518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:34.9999961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0000522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0000962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0001411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:35.0001898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:35.0002441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:35.0002913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:35.0003559Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0004238Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0004909Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0005624Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:35.0006140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:35.0006608Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:35.0007050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:35.0007512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:35.0007859Z dist init r=2, world=4 2022-11-23T03:50:35.0008106Z dist init r=1, world=4 2022-11-23T03:50:35.0008396Z dist init r=0, world=4 2022-11-23T03:50:35.0008643Z dist init r=3, world=4 2022-11-23T03:50:35.0008874Z ok (4.920s) 2022-11-23T03:50:35.0009486Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111506 2022-11-23T03:50:35.0010189Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111507 2022-11-23T03:50:35.0010634Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111508 2022-11-23T03:50:35.0011070Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111509 2022-11-23T03:50:35.0011661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0012106Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0012675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0013122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0013695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:50:35.0014144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0014719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0015167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0015744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0016192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0016767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0017215Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0017779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0018213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0018758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0019274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0019729Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:35.0020231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:35.0020698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:35.0021190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:35.0021838Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:35.0022504Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0023188Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0024183Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0024762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:35.0025216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:35.0025763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:35.0026216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:35.0026568Z dist init r=3, world=4 2022-11-23T03:50:35.0026801Z dist init r=0, world=4 2022-11-23T03:50:35.0027046Z dist init r=1, world=4 2022-11-23T03:50:35.0027291Z dist init r=2, world=4 2022-11-23T03:50:35.0027505Z ok (4.319s) 2022-11-23T03:50:35.0028135Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111791 2022-11-23T03:50:35.0028844Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111792 2022-11-23T03:50:35.0029290Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111793 2022-11-23T03:50:35.0029719Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111794 2022-11-23T03:50:35.0030339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0030781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0031327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0031793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0032364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0032803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0033350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0033814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0034384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0034803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0035367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0035826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0036506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:35.0036937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0037506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0037965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0038418Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:35.0038891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:35.0039374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:35.0039853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:35.0040485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0041167Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0041842Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0042583Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:35.0043080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:35.0043545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:35.0043997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:35.0044459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:35.0044798Z dist init r=2, world=4 2022-11-23T03:50:35.0045050Z dist init r=3, world=4 2022-11-23T03:50:35.0045294Z dist init r=0, world=4 2022-11-23T03:50:35.0045524Z dist init r=1, world=4 2022-11-23T03:50:35.0045755Z ok (4.921s) 2022-11-23T03:50:35.0046376Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112092 2022-11-23T03:50:35.0047077Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112093 2022-11-23T03:50:35.0047505Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112094 2022-11-23T03:50:35.0047943Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112095 2022-11-23T03:50:35.0048558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0048992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0049562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0050024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0050591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T03:50:35.0051013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0051575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0052030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0052585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0053081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0053647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0054097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0054645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0055088Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0055648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0056098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0056526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:35.0057022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:35.0057502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:35.0057968Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:35.0058616Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:35.0059358Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0060027Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0060683Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0061198Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:35.0061662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:35.0062121Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:35.0062565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:35.0062909Z dist init r=3, world=4 2022-11-23T03:50:35.0063166Z dist init r=0, world=4 2022-11-23T03:50:35.0063393Z dist init r=1, world=4 2022-11-23T03:50:35.0063635Z dist init r=2, world=4 2022-11-23T03:50:35.0064131Z ok (4.319s) 2022-11-23T03:50:35.0064758Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112377 2022-11-23T03:50:35.0065472Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112378 2022-11-23T03:50:35.0065922Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112379 2022-11-23T03:50:35.0066371Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112380 2022-11-23T03:50:35.0066972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0067429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0068006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0068483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0069046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0069498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0070147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0070610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0071194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0071633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0072202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0072645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0073214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:35.0073651Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0074204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0074744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0075208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:35.0075707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:35.0076300Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:35.0076790Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:35.0077447Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0078137Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0078808Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0079493Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:35.0080020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:35.0080494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:35.0080950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:35.0081414Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:35.0081774Z dist init r=3, world=4 2022-11-23T03:50:35.0082013Z dist init r=1, world=4 2022-11-23T03:50:35.0082266Z dist init r=2, world=4 2022-11-23T03:50:35.0082520Z dist init r=0, world=4 2022-11-23T03:50:35.0082740Z ok (4.921s) 2022-11-23T03:50:35.0083211Z test_wrap_batchnorm_individually_use_or_policy_False (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112678 2022-11-23T03:50:35.0083762Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112679 2022-11-23T03:50:35.0084208Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112680 2022-11-23T03:50:35.0084632Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112681 2022-11-23T03:50:35.0085255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0085713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0086264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0086741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0087371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0087826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0088382Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0088854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0089437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0089884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0090432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0090898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0091480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0091909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0092482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0092950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0093467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:35.0093948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:35.0094440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:35.0094928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:35.0095565Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:35.0096253Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0096935Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0097621Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0098125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:35.0098602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:35.0099063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:35.0099532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:35.0100788Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:35.0101576Z warnings.warn( 2022-11-23T03:50:35.0102734Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T03:50:35.0103502Z warnings.warn( 2022-11-23T03:50:35.0104977Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:35.0105813Z warnings.warn( 2022-11-23T03:50:35.0106952Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:35.0107713Z warnings.warn( 2022-11-23T03:50:35.0107958Z dist init r=3, world=4 2022-11-23T03:50:35.0108204Z dist init r=0, world=4 2022-11-23T03:50:35.0108433Z dist init r=2, world=4 2022-11-23T03:50:35.0108677Z dist init r=1, world=4 2022-11-23T03:50:35.0108909Z ok (4.317s) 2022-11-23T03:50:35.0109344Z test_wrap_batchnorm_individually_use_or_policy_True (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112963 2022-11-23T03:50:35.0109976Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112964 2022-11-23T03:50:35.0110416Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112965 2022-11-23T03:50:35.0110857Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112966 2022-11-23T03:50:35.0111450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0111900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0112471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0112917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0113489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0113931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0114491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0114935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0115503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:50:35.0115942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0116493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0116954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0117520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:50:35.0117961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:50:35.0118509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:50:35.0118964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:50:35.0119409Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:50:35.0119900Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:50:35.0120460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:50:35.0120948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:50:35.0121599Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0122266Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0122949Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:50:35.0123692Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:50:35.0124207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:50:35.0124663Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:50:35.0125149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:50:35.0125611Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:50:35.0126869Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:35.0127709Z warnings.warn( 2022-11-23T03:50:35.0128858Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:35.0129621Z warnings.warn( 2022-11-23T03:50:35.0130770Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:35.0131520Z warnings.warn( 2022-11-23T03:50:35.0132654Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T03:50:35.0133407Z warnings.warn( 2022-11-23T03:50:35.0133638Z dist init r=3, world=4 2022-11-23T03:50:35.0133882Z dist init r=0, world=4 2022-11-23T03:50:35.0134128Z dist init r=1, world=4 2022-11-23T03:50:35.0134354Z dist init r=2, world=4 2022-11-23T03:50:35.0134585Z ok (4.318s) 2022-11-23T03:50:35.0134736Z 2022-11-23T03:50:35.0135006Z ---------------------------------------------------------------------- 2022-11-23T03:50:35.0135325Z Ran 47 tests in 109.629s 2022-11-23T03:50:35.0135488Z 2022-11-23T03:50:35.0135579Z OK 2022-11-23T03:50:35.0135711Z 2022-11-23T03:50:35.0135836Z Generating XML reports... 
2022-11-23T03:50:35.0136452Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestAutoWrap-20221123034844.xml 2022-11-23T03:50:35.0137167Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestFSDPWrap-20221123034844.xml 2022-11-23T03:50:35.0137488Z 2022-11-23T03:50:35.0137946Z ##[endgroup] 2022-11-23T03:50:35.0138490Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_wrap (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_wrap_smb40d5d) 2022-11-23T03:50:35.0138827Z 2022-11-23T03:50:35.3934380Z 2022-11-23T03:50:35.3934872Z real 1m57.746s 2022-11-23T03:50:35.3935281Z user 6m20.909s 2022-11-23T03:50:35.3935522Z sys 4m9.905s 2022-11-23T03:50:35.3936089Z + python test/run_test.py --verbose -i distributed/checkpoint/test_checkpoint 2022-11-23T03:50:37.8161966Z Ignoring disabled issues: [] 2022-11-23T03:50:37.8693613Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:50:37.8694197Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:50:37.8694554Z Selected tests: 2022-11-23T03:50:37.8694828Z distributed/checkpoint/test_checkpoint 2022-11-23T03:50:37.8720170Z Prioritized test from test file changes. 
2022-11-23T03:50:37.8720627Z reordering tests for PR: 2022-11-23T03:50:37.8721196Z prioritized: [] 2022-11-23T03:50:37.8722048Z the rest: ['distributed/checkpoint/test_checkpoint'] 2022-11-23T03:50:37.8722605Z 2022-11-23T03:50:37.8723149Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:50:37.8724086Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:50:37.8728503Z parallel (file granularity) tests: 2022-11-23T03:50:37.8728924Z 2022-11-23T03:50:37.8729309Z serial (file granularity) tests: 2022-11-23T03:50:37.8729622Z distributed/checkpoint/test_checkpoint 2022-11-23T03:50:40.2049094Z Ignoring disabled issues: [] 2022-11-23T03:50:40.6291620Z Running distributed/checkpoint/test_checkpoint ... [2022-11-23 03:50:40.628627] 2022-11-23T03:50:40.6293128Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/checkpoint/test_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:50:40.629070] 2022-11-23T03:51:16.0234226Z 2022-11-23T03:51:16.0235326Z Expand the folded group to see the log file of distributed/checkpoint/test_checkpoint 2022-11-23T03:51:16.0236314Z ##[group]PRINTING LOG FILE of distributed/checkpoint/test_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_checkpoint_h3d_1h7o) 2022-11-23T03:51:16.0236682Z 2022-11-23T03:51:16.0240264Z Running tests... 2022-11-23T03:51:16.0240871Z ---------------------------------------------------------------------- 2022-11-23T03:51:16.0241497Z Test results will be stored in test-reports/python-unittest/distributed.checkpoint.test_checkpoint 2022-11-23T03:51:16.0242040Z test_default_metadata (__main__.TestDistributedCheckpointing) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:51:16.0242525Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113460 2022-11-23T03:51:16.0242986Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113461 2022-11-23T03:51:16.0243616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0244081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0244635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0245116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0246003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0246458Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0249848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0250340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0250821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:16.0251325Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:16.0251805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:16.0252302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:16.0253057Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:51:16.0253758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:16.0254141Z ok (5.757s) 2022-11-23T03:51:16.0254618Z test_tensor_metadata_with_missing_rank_spec (__main__.TestDistributedCheckpointing) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113603 2022-11-23T03:51:16.0255410Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113604 2022-11-23T03:51:16.0256026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0256465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0257038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0258870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0259489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0259923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0260542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0261018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0261470Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:16.0261930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:16.0262429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:16.0262921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:16.0263581Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:16.0264718Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:16.0265119Z ok (4.113s) 2022-11-23T03:51:16.0265558Z test_dummy_reader_works (__main__.TestDistributedFailure) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113746 2022-11-23T03:51:16.0266090Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113747 2022-11-23T03:51:16.0266520Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 113748 2022-11-23T03:51:16.0266970Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 113749 2022-11-23T03:51:16.0267899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0268458Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0269054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0269523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0270098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0270530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0271256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0271704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0272258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0272666Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0273217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0273660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0274193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0274781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0275438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0275893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0276306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:16.0276775Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:51:16.0277243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:51:16.0277683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:16.0278166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:16.0278655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:51:16.0279143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:51:16.0279607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:16.0280413Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:51:16.0281072Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0281731Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0282365Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0282740Z ok (4.417s) 2022-11-23T03:51:16.0283159Z test_dummy_writer_works (__main__.TestDistributedFailure) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 114039 2022-11-23T03:51:16.0283654Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 114040 2022-11-23T03:51:16.0284079Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 114041 2022-11-23T03:51:16.0284502Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 114042 2022-11-23T03:51:16.0285082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0285555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0286290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0286755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0287329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0287757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0288326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0288785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:51:16.0289335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0289769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0290333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0290789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0291341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0291784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0292453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0292896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0293328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:16.0293798Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:51:16.0294289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:16.0294754Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:51:16.0295232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:51:16.0295709Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:16.0296183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:51:16.0296772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:16.0297432Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0298097Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0298934Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0299585Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0299958Z ok (4.517s) 2022-11-23T03:51:16.0300540Z test_load_error_handling (__main__.TestDistributedFailure) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 114332 2022-11-23T03:51:16.0301078Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 114333 2022-11-23T03:51:16.0301525Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 114334 2022-11-23T03:51:16.0301949Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 114335 2022-11-23T03:51:16.0302552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0303060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0303642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0304462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0305039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0305489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0306039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 
disabled tests 2022-11-23T03:51:16.0306501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0307072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0307508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0308394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0308853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0309416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0309946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0310492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0310947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0311378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:16.0311998Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:16.0312650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:51:16.0313120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:51:16.0313576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:16.0314039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:51:16.0314597Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:51:16.0315081Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:16.0315886Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0316546Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0317202Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0317854Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0318207Z ok (4.719s) 2022-11-23T03:51:16.0318637Z test_load_error_handling_no_dist (__main__.TestDistributedFailure) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 114635 2022-11-23T03:51:16.0319161Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 114636 2022-11-23T03:51:16.0319588Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 114637 2022-11-23T03:51:16.0319994Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 114638 2022-11-23T03:51:16.0320577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0321090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0321638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0322087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0322638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0323069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T03:51:16.0323602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0324044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0324590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0325014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0325541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0325986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0326532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0327005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0327549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0327991Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0328407Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:51:16.0328840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:16.0329280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:51:16.0329727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:16.0330035Z ok (2.412s) 2022-11-23T03:51:16.0330456Z test_save_error_handling (__main__.TestDistributedFailure) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 114899 2022-11-23T03:51:16.0330968Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 114900 2022-11-23T03:51:16.0331576Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 114901 2022-11-23T03:51:16.0332101Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 114902 2022-11-23T03:51:16.0332713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0333152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0333707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0334330Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0334885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0335313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0335846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0336290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0336837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0337259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0338029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0338497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0339066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:51:16.0339483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0340048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0340505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0341089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:16.0341537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:16.0341994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:51:16.0342628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:16.0343064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:51:16.0343547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:51:16.0344287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:16.0344852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:51:16.0345499Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0346181Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0346863Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:51:16.0347535Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:51:16.0348060Z ok (4.618s) 2022-11-23T03:51:16.0348490Z test_save_error_handling_no_dist (__main__.TestDistributedFailure) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 115198 2022-11-23T03:51:16.0349012Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 115199 2022-11-23T03:51:16.0349440Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 115200 2022-11-23T03:51:16.0349848Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 115201 2022-11-23T03:51:16.0350432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0350862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0351396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0351841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0352393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0352879Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0353412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0353854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0354401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0354805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0355434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0355890Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-11-23T03:51:16.0356440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:16.0357036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:16.0357604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:16.0358060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:16.0358492Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:16.0358942Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:16.0359400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:51:16.0359862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:51:16.0360182Z ok (2.412s) 2022-11-23T03:51:16.0360329Z 2022-11-23T03:51:16.0360599Z ---------------------------------------------------------------------- 2022-11-23T03:51:16.0360932Z Ran 8 tests in 32.965s 2022-11-23T03:51:16.0361258Z 2022-11-23T03:51:16.0361419Z OK 2022-11-23T03:51:16.0361725Z 2022-11-23T03:51:16.0361851Z Generating XML reports... 
2022-11-23T03:51:16.0362512Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_checkpoint/TEST-TestDistributedCheckpointing-20221123035042.xml 2022-11-23T03:51:16.0363360Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_checkpoint/TEST-TestDistributedFailure-20221123035042.xml 2022-11-23T03:51:16.0363731Z 2022-11-23T03:51:16.0364164Z ##[endgroup] 2022-11-23T03:51:16.0364803Z FINISHED PRINTING LOG FILE of distributed/checkpoint/test_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_checkpoint_h3d_1h7o) 2022-11-23T03:51:16.0365179Z 2022-11-23T03:51:16.4192131Z 2022-11-23T03:51:16.4192656Z real 0m41.026s 2022-11-23T03:51:16.4193177Z user 1m55.863s 2022-11-23T03:51:16.4193662Z sys 1m23.272s 2022-11-23T03:51:16.4194235Z + python test/run_test.py --verbose -i distributed/checkpoint/test_file_system_checkpoint 2022-11-23T03:51:18.7745672Z Ignoring disabled issues: [] 2022-11-23T03:51:18.8281819Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:51:18.8282413Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:51:18.8283590Z Selected tests: 2022-11-23T03:51:18.8283987Z distributed/checkpoint/test_file_system_checkpoint 2022-11-23T03:51:18.8310875Z Prioritized test from test file changes. 
2022-11-23T03:51:18.8311212Z reordering tests for PR: 2022-11-23T03:51:18.8311512Z prioritized: [] 2022-11-23T03:51:18.8312048Z the rest: ['distributed/checkpoint/test_file_system_checkpoint'] 2022-11-23T03:51:18.8312289Z 2022-11-23T03:51:18.8312835Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:51:18.8313777Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:51:18.8319140Z parallel (file granularity) tests: 2022-11-23T03:51:18.8319423Z 2022-11-23T03:51:18.8319677Z serial (file granularity) tests: 2022-11-23T03:51:18.8319997Z distributed/checkpoint/test_file_system_checkpoint 2022-11-23T03:51:21.1188269Z Ignoring disabled issues: [] 2022-11-23T03:51:21.5563478Z Running distributed/checkpoint/test_file_system_checkpoint ... [2022-11-23 03:51:21.555793] 2022-11-23T03:51:21.5564700Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/checkpoint/test_file_system_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:51:21.556248] 2022-11-23T03:51:47.7875889Z 2022-11-23T03:51:47.7876656Z Expand the folded group to see the log file of distributed/checkpoint/test_file_system_checkpoint 2022-11-23T03:51:47.7877853Z ##[group]PRINTING LOG FILE of distributed/checkpoint/test_file_system_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_file_system_checkpoint_xwzxqn9d) 2022-11-23T03:51:47.7883943Z 2022-11-23T03:51:47.7884705Z Running tests... 
2022-11-23T03:51:47.7885304Z ---------------------------------------------------------------------- 2022-11-23T03:51:47.7885989Z Test results will be stored in test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint 2022-11-23T03:51:47.7886529Z test_load_rowwise_to_colwise (__main__.TestDistributedReshardOnLoad) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:51:47.7887138Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 115674 2022-11-23T03:51:47.7887507Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 115675 2022-11-23T03:51:47.7888167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7888632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7889479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7889968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7890559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7891014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7891648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7892148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7892509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:47.7893012Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:47.7893494Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:47.7894105Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:47.7894775Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7895469Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7895769Z ok (5.862s) 2022-11-23T03:51:47.7896328Z test_load_with_different_shard_plan (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 115821 2022-11-23T03:51:47.7896832Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 115822 2022-11-23T03:51:47.7897433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7897902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7898485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7898962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7900094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7900558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7901304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7901902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7902497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:47.7902889Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:47.7903450Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:47.7904339Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:47.7905004Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7906158Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7906966Z ok (4.717s) 2022-11-23T03:51:47.7907756Z test_save_load_bytes (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 115968 2022-11-23T03:51:47.7908315Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 115969 2022-11-23T03:51:47.7909040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7909553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7910123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7910822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7912068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7912807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7913369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7913842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7914287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:47.7914769Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:47.7915243Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:47.7915738Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:47.7916400Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7917074Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7917472Z ok (4.114s) 2022-11-23T03:51:47.7918016Z test_switch_between_sharded_tensor_to_tensor (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 116115 2022-11-23T03:51:47.7918636Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 116116 2022-11-23T03:51:47.7919134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7919589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7920167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7920640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7921319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7921788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7922365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7922808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:51:47.7923251Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:47.7923747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:47.7924238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:47.7924696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:47.7925354Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7926051Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7926449Z ok (4.916s) 2022-11-23T03:51:47.7926794Z test_read_write_only_tensor (__main__.TestDistributedStateDictSaveLoad) ... ok (0.056s) 2022-11-23T03:51:47.7927444Z test_read_write_shard_tensor (__main__.TestDistributedStateDictSaveLoadWithSharedTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 116262 2022-11-23T03:51:47.7928129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 116263 2022-11-23T03:51:47.7928721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7929176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7929772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7930230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7930790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:51:47.7931242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:51:47.7931815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:51:47.7932266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:51:47.7932702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:51:47.7933182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:51:47.7933674Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:51:47.7934153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:51:47.7934814Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T03:51:47.7935501Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T03:51:47.7935894Z ok (4.114s) 2022-11-23T03:51:47.7936028Z 2022-11-23T03:51:47.7936320Z ---------------------------------------------------------------------- 2022-11-23T03:51:47.7936640Z Ran 6 tests in 23.780s 2022-11-23T03:51:47.7936806Z 2022-11-23T03:51:47.7936904Z OK 2022-11-23T03:51:47.7937043Z 2022-11-23T03:51:47.7937148Z Generating XML reports... 2022-11-23T03:51:47.7937831Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint/TEST-TestDistributedReshardOnLoad-20221123035123.xml 2022-11-23T03:51:47.7938832Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoad-20221123035123.xml 2022-11-23T03:51:47.7939864Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20221123035123.xml 2022-11-23T03:51:47.7940357Z 2022-11-23T03:51:47.7940869Z ##[endgroup] 2022-11-23T03:51:47.7941559Z FINISHED PRINTING LOG FILE of distributed/checkpoint/test_file_system_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_file_system_checkpoint_xwzxqn9d) 2022-11-23T03:51:47.7941975Z 2022-11-23T03:51:48.1484033Z 2022-11-23T03:51:48.1484647Z real 0m31.729s 2022-11-23T03:51:48.1484972Z user 0m59.954s 2022-11-23T03:51:48.1485199Z sys 0m52.124s 2022-11-23T03:51:48.1485845Z + python test/run_test.py --verbose -i distributed/_shard/sharding_spec/test_sharding_spec 2022-11-23T03:51:50.5274071Z Ignoring disabled issues: [] 2022-11-23T03:51:50.5814168Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T03:51:50.5814748Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:51:50.5815116Z Selected tests: 2022-11-23T03:51:50.5815442Z distributed/_shard/sharding_spec/test_sharding_spec 2022-11-23T03:51:50.5841663Z Prioritized test from test file changes. 2022-11-23T03:51:50.5842386Z reordering tests for PR: 2022-11-23T03:51:50.5842671Z prioritized: [] 2022-11-23T03:51:50.5843246Z the rest: ['distributed/_shard/sharding_spec/test_sharding_spec'] 2022-11-23T03:51:50.5843488Z 2022-11-23T03:51:50.5844016Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:51:50.5844970Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:51:50.5848907Z parallel (file granularity) tests: 2022-11-23T03:51:50.5849198Z 2022-11-23T03:51:50.5849444Z serial (file granularity) tests: 2022-11-23T03:51:52.8978591Z distributed/_shard/sharding_spec/test_sharding_spec 2022-11-23T03:51:52.8979027Z Ignoring disabled issues: [] 2022-11-23T03:51:53.2840189Z Running distributed/_shard/sharding_spec/test_sharding_spec ... [2022-11-23 03:51:53.283378] 2022-11-23T03:51:53.2843438Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharding_spec/test_sharding_spec.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 03:51:53.283880] 2022-11-23T03:52:27.6277923Z 2022-11-23T03:52:27.6278758Z Expand the folded group to see the log file of distributed/_shard/sharding_spec/test_sharding_spec 2022-11-23T03:52:27.6279955Z ##[group]PRINTING LOG FILE of distributed/_shard/sharding_spec/test_sharding_spec (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharding_spec-test_sharding_spec_isn99ko_) 2022-11-23T03:52:27.6284328Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxrl93pnv 2022-11-23T03:52:27.6284956Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxrl93pnv/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6285274Z 2022-11-23T03:52:27.6285403Z Running tests... 2022-11-23T03:52:27.6285983Z ---------------------------------------------------------------------- 2022-11-23T03:52:27.6286619Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec 2022-11-23T03:52:27.6287134Z test_custom_sharding_spec (__main__.TestCustomShardingSpec) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:52:27.6287646Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 116621 2022-11-23T03:52:27.6289479Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 116622 2022-11-23T03:52:27.6289930Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 116623 2022-11-23T03:52:27.6290705Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 116624 2022-11-23T03:52:27.6291392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6291861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6292449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6292923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6293515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6293974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6294566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6295050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6295642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6296081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6296663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6297292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6297891Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6298338Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6298916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6299387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6299848Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphpounni6 2022-11-23T03:52:27.6300407Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphpounni6/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6300948Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbldzwqn4 2022-11-23T03:52:27.6301493Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbldzwqn4/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6302019Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4lvdh9wy 2022-11-23T03:52:27.6302572Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4lvdh9wy/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6303098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:52:27.6303594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:52:27.6304612Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp10whed2p 2022-11-23T03:52:27.6305201Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp10whed2p/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6305655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:52:27.6306181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:52:27.6306524Z ok (4.236s) 
2022-11-23T03:52:27.6306820Z test_custom_sharding_spec_shard_tensor (__main__.TestCustomShardingSpec) 2022-11-23T03:52:27.6307294Z Test custom spec can be invoked from the ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 116885 2022-11-23T03:52:27.6307781Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 116886 2022-11-23T03:52:27.6308231Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 116887 2022-11-23T03:52:27.6308779Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 116888 2022-11-23T03:52:27.6309425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6309859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6310436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6310918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6311479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6312026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6312564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6313036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6313592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6314044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6314620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6315183Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6315744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6316248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6316829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6317309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6317771Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx38kn7ka 2022-11-23T03:52:27.6318337Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx38kn7ka/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6318870Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppkriffrj 2022-11-23T03:52:27.6319500Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppkriffrj/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6319938Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdc7astws 2022-11-23T03:52:27.6320546Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdc7astws/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6321007Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps5nu9_sr 2022-11-23T03:52:27.6321537Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps5nu9_sr/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6322026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:52:27.6322500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:52:27.6322974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:52:27.6323424Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:52:27.6323833Z fi_getinfo: -61 2022-11-23T03:52:27.6324126Z fi_getinfo: -61 2022-11-23T03:52:27.6324384Z fi_getinfo: -61 2022-11-23T03:52:27.6324665Z fi_getinfo: -61 2022-11-23T03:52:27.6325061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:52:27.6325563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:52:27.6326036Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:52:27.6326763Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:52:27.6327421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:52:27.6328075Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:52:27.6328773Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:52:27.6329468Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:52:27.6329868Z ok (13.945s) 2022-11-23T03:52:27.6330195Z test_custom_sharding_spec_tensor_ctor (__main__.TestCustomShardingSpec) 2022-11-23T03:52:27.6330715Z Test sharded_tensor.ones(...) with the custom ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 117354 2022-11-23T03:52:27.6331232Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 117355 2022-11-23T03:52:27.6331684Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 117356 2022-11-23T03:52:27.6332118Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 117357 2022-11-23T03:52:27.6332733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6333272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6333837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6334315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6334901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6335357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6335909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6336388Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6336972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:52:27.6337495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6338069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6338541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6339123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:52:27.6339549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:52:27.6340122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:52:27.6340592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:52:27.6341059Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9e3_tsm_ 2022-11-23T03:52:27.6341580Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9e3_tsm_/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6342123Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl4damrw3 2022-11-23T03:52:27.6342660Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl4damrw3/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6343174Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpecomrurm 2022-11-23T03:52:27.6343712Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpecomrurm/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6344527Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi83im6s1 2022-11-23T03:52:27.6345070Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi83im6s1/_remote_module_non_scriptable.py 2022-11-23T03:52:27.6345557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:52:27.6346030Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:52:27.6346499Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:52:27.6346950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:52:27.6347360Z fi_getinfo: -61 2022-11-23T03:52:27.6347646Z fi_getinfo: -61 2022-11-23T03:52:27.6347909Z fi_getinfo: -61 
2022-11-23T03:52:27.6348196Z fi_getinfo: -61 2022-11-23T03:52:27.6348577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:52:27.6349081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:52:27.6349557Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:52:27.6350048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:52:27.6350694Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:52:27.6351550Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:52:27.6352272Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:52:27.6353019Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:52:27.6353453Z ok (13.644s) 2022-11-23T03:52:27.6353783Z test_check_overlapping (__main__.TestShardingSpec) ... ok (0.003s) 2022-11-23T03:52:27.6354221Z test_chunked_sharding_spec (__main__.TestShardingSpec) ... ok (0.012s) 2022-11-23T03:52:27.6354658Z test_device_placement (__main__.TestShardingSpec) ... ok (0.007s) 2022-11-23T03:52:27.6355088Z test_enumerable_sharding_spec (__main__.TestShardingSpec) ... ok (0.032s) 2022-11-23T03:52:27.6355510Z test_get_chunk_sharding_params (__main__.TestShardingSpec) ... ok (0.002s) 2022-11-23T03:52:27.6356180Z test_get_chunked_dim_size (__main__.TestShardingSpec) ... ok (0.001s) 2022-11-23T03:52:27.6356588Z test_get_split_size (__main__.TestShardingSpec) ... 
ok (0.001s) 2022-11-23T03:52:27.6356997Z test_infer_sharding_spec_from_shards_metadata (__main__.TestShardingSpec) ... ok (0.010s) 2022-11-23T03:52:27.6357261Z 2022-11-23T03:52:27.6357545Z ---------------------------------------------------------------------- 2022-11-23T03:52:27.6357895Z Ran 11 tests in 31.894s 2022-11-23T03:52:27.6358071Z 2022-11-23T03:52:27.6358197Z OK 2022-11-23T03:52:27.6358285Z 2022-11-23T03:52:27.6358415Z Generating XML reports... 2022-11-23T03:52:27.6359072Z Generated XML report: test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec/TEST-TestCustomShardingSpec-20221123035155.xml 2022-11-23T03:52:27.6359901Z Generated XML report: test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec/TEST-TestShardingSpec-20221123035155.xml 2022-11-23T03:52:27.6360271Z 2022-11-23T03:52:27.6360769Z ##[endgroup] 2022-11-23T03:52:27.6361454Z FINISHED PRINTING LOG FILE of distributed/_shard/sharding_spec/test_sharding_spec (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharding_spec-test_sharding_spec_isn99ko_) 2022-11-23T03:52:27.6361857Z 2022-11-23T03:52:27.9808512Z 2022-11-23T03:52:27.9809068Z real 0m39.832s 2022-11-23T03:52:27.9809406Z user 1m51.613s 2022-11-23T03:52:27.9809678Z sys 1m31.770s 2022-11-23T03:52:27.9810563Z + python test/run_test.py --verbose -i distributed/_shard/sharding_plan/test_sharding_plan 2022-11-23T03:52:30.3600764Z Ignoring disabled issues: [] 2022-11-23T03:52:30.4140339Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:52:30.4140839Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:52:30.4141253Z Selected tests: 2022-11-23T03:52:30.4141664Z distributed/_shard/sharding_plan/test_sharding_plan 2022-11-23T03:52:30.4169830Z Prioritized test from test file changes. 
2022-11-23T03:52:30.4170182Z reordering tests for PR: 2022-11-23T03:52:30.4170476Z prioritized: [] 2022-11-23T03:52:30.4171087Z the rest: ['distributed/_shard/sharding_plan/test_sharding_plan'] 2022-11-23T03:52:30.4171332Z 2022-11-23T03:52:30.4171854Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:52:30.4172811Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:52:30.4178975Z parallel (file granularity) tests: 2022-11-23T03:52:30.4179276Z 2022-11-23T03:52:30.4179519Z serial (file granularity) tests: 2022-11-23T03:52:30.4179857Z distributed/_shard/sharding_plan/test_sharding_plan 2022-11-23T03:52:32.7024719Z Ignoring disabled issues: [] 2022-11-23T03:52:33.1043556Z Running distributed/_shard/sharding_plan/test_sharding_plan ... [2022-11-23 03:52:33.103674] 2022-11-23T03:52:33.1044663Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharding_plan/test_sharding_plan.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:52:33.104145] 2022-11-23T03:53:00.2862594Z 2022-11-23T03:53:00.2863327Z Expand the folded group to see the log file of distributed/_shard/sharding_plan/test_sharding_plan 2022-11-23T03:53:00.2867468Z ##[group]PRINTING LOG FILE of distributed/_shard/sharding_plan/test_sharding_plan (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharding_plan-test_sharding_plan_wlb1tops) 2022-11-23T03:53:00.2867885Z 2022-11-23T03:53:00.2868004Z Running tests... 
2022-11-23T03:53:00.2868636Z ---------------------------------------------------------------------- 2022-11-23T03:53:00.2869209Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan 2022-11-23T03:53:00.2869759Z test_custom_sharding_planner (__main__.TestShardingPlan) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:53:00.2870643Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 118035 2022-11-23T03:53:00.2871195Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 118036 2022-11-23T03:53:00.2871573Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 118037 2022-11-23T03:53:00.2872035Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 118038 2022-11-23T03:53:00.2872667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2873707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2874395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2874824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2875389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2875851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2876446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2877250Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2877915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2878381Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2878866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2879321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2879910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2880365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2880948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2881407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2881850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:53:00.2882330Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:53:00.2882811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:53:00.2883303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:53:00.2883914Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:53:00.2884415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:53:00.2884917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:53:00.2885422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:53:00.2886094Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:53:00.2886796Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2887462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2888160Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2888557Z ok (6.159s) 2022-11-23T03:53:00.2889042Z test_reshard_to_ddp_sharding_plan (__main__.TestShardingPlan) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 118324 2022-11-23T03:53:00.2889618Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 118325 2022-11-23T03:53:00.2890075Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 118326 2022-11-23T03:53:00.2890547Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 118327 2022-11-23T03:53:00.2891101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2891500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2892106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2892593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2893162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2893613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2894188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2894731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T03:53:00.2895307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2895766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2896345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2896798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2897470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2897907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2898483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2898959Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2899417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:53:00.2899881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:53:00.2900374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:53:00.2900870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:53:00.2901639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:53:00.2902080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:53:00.2902572Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:53:00.2903067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:53:00.2903715Z 
INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2904968Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2905684Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2906388Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2907182Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:00.2907747Z warnings.warn( 2022-11-23T03:53:00.2908510Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:00.2909061Z warnings.warn( 2022-11-23T03:53:00.2909811Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:00.2910342Z warnings.warn( 2022-11-23T03:53:00.2911087Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:00.2911625Z warnings.warn( 2022-11-23T03:53:00.2911850Z ok (4.820s) 2022-11-23T03:53:00.2912298Z test_shard_module_sub_process_group (__main__.TestShardingPlan) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 118613 2022-11-23T03:53:00.2912933Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 118614 2022-11-23T03:53:00.2913410Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 118615 2022-11-23T03:53:00.2913842Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 118616 2022-11-23T03:53:00.2914470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2914939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2915526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2915982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2916568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2917027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2917586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2918155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2918640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2919200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2919759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2920242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2920830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:53:00.2921264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2921849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2922318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2922771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:53:00.2923230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:53:00.2923702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:53:00.2924208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:53:00.2924668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:53:00.2925220Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:53:00.2925650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:53:00.2926143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:53:00.2926782Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2927483Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2928176Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2928859Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:53:00.2929366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T03:53:00.2929924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T03:53:00.2930425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T03:53:00.2931089Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:53:00.2931593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T03:53:00.2932249Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:53:00.2932927Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:53:00.2933606Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T03:53:00.2933971Z ok (4.418s) 2022-11-23T03:53:00.2934411Z test_sharding_plan_errors (__main__.TestShardingPlan) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 118909 2022-11-23T03:53:00.2934945Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 118910 2022-11-23T03:53:00.2935373Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 118911 2022-11-23T03:53:00.2935895Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 118912 2022-11-23T03:53:00.2936592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2937051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2937700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2938082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2938673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2939129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2939681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2940151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2940731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2941159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2941730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2942197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2942778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T03:53:00.2943203Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2943781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2944624Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2945026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:53:00.2945517Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:53:00.2946084Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:53:00.2946514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:53:00.2946954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:53:00.2947512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:53:00.2948019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:53:00.2948514Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:53:00.2949166Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2949953Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2950738Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2951298Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:53:00.2951700Z ok (4.318s) 2022-11-23T03:53:00.2952155Z test_sharding_plan_simple_megatron (__main__.TestShardingPlan) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 119198 2022-11-23T03:53:00.2952704Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 119199 2022-11-23T03:53:00.2953141Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 119200 2022-11-23T03:53:00.2953594Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 119201 2022-11-23T03:53:00.2954295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2954758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2955314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2955788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2956480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2956807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2957386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2957855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2958437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2958864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2959542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2959908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled 
tests") 2022-11-23T03:53:00.2960468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:00.2960916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:00.2961568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:00.2961963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:00.2962386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:53:00.2962865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:53:00.2963339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:53:00.2963809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:53:00.2964278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:53:00.2964836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:53:00.2965374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:53:00.2965820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:53:00.2966485Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2967189Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2967880Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:53:00.2968538Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:00.2969457Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:00.2970010Z warnings.warn( 2022-11-23T03:53:00.2970773Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:00.2971380Z warnings.warn( 2022-11-23T03:53:00.2972141Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:00.2972688Z warnings.warn( 2022-11-23T03:53:00.2973531Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:00.2974049Z warnings.warn( 2022-11-23T03:53:00.2974293Z ok (5.023s) 2022-11-23T03:53:00.2974451Z 2022-11-23T03:53:00.2974726Z ---------------------------------------------------------------------- 2022-11-23T03:53:00.2975070Z Ran 5 tests in 24.738s 2022-11-23T03:53:00.2975218Z 2022-11-23T03:53:00.2975318Z OK 2022-11-23T03:53:00.2975457Z 2022-11-23T03:53:00.2975588Z Generating XML reports... 
2022-11-23T03:53:00.2976225Z Generated XML report: test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan/TEST-TestShardingPlan-20221123035235.xml 2022-11-23T03:53:00.2976596Z 2022-11-23T03:53:00.2976992Z ##[endgroup] 2022-11-23T03:53:00.2977674Z FINISHED PRINTING LOG FILE of distributed/_shard/sharding_plan/test_sharding_plan (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharding_plan-test_sharding_plan_wlb1tops) 2022-11-23T03:53:00.2978071Z 2022-11-23T03:53:00.6573852Z 2022-11-23T03:53:00.6574736Z real 0m32.676s 2022-11-23T03:53:00.6575331Z user 1m32.693s 2022-11-23T03:53:00.6575639Z sys 1m5.370s 2022-11-23T03:53:00.6576301Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/test_megatron_prototype 2022-11-23T03:53:03.0351796Z Ignoring disabled issues: [] 2022-11-23T03:53:03.0899397Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:53:03.0900066Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:53:03.0900378Z Selected tests: 2022-11-23T03:53:03.0900718Z distributed/_shard/sharded_tensor/test_megatron_prototype 2022-11-23T03:53:03.0925853Z Prioritized test from test file changes. 
2022-11-23T03:53:03.0926199Z reordering tests for PR: 2022-11-23T03:53:03.0926505Z prioritized: [] 2022-11-23T03:53:03.0927404Z the rest: ['distributed/_shard/sharded_tensor/test_megatron_prototype'] 2022-11-23T03:53:03.0927653Z 2022-11-23T03:53:03.0928212Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:53:03.0929203Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:53:03.0933546Z parallel (file granularity) tests: 2022-11-23T03:53:03.0933945Z 2022-11-23T03:53:03.0934192Z serial (file granularity) tests: 2022-11-23T03:53:03.0934565Z distributed/_shard/sharded_tensor/test_megatron_prototype 2022-11-23T03:53:05.4204168Z Ignoring disabled issues: [] 2022-11-23T03:53:05.8388475Z Running distributed/_shard/sharded_tensor/test_megatron_prototype ... [2022-11-23 03:53:05.838136] 2022-11-23T03:53:05.8389913Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_megatron_prototype.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:53:05.838675] 2022-11-23T03:53:16.2237450Z 2022-11-23T03:53:16.2238125Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/test_megatron_prototype 2022-11-23T03:53:16.2239351Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/test_megatron_prototype (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-test_megatron_prototype_qm92crgt) 2022-11-23T03:53:16.2240313Z 2022-11-23T03:53:16.2240431Z Running tests... 
2022-11-23T03:53:16.2240973Z ---------------------------------------------------------------------- 2022-11-23T03:53:16.2241582Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype 2022-11-23T03:53:16.2242146Z test_megatron_two_layer_prototype (__main__.TestShardedTensorMegatronLinear) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T03:53:16.2242677Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 119719 2022-11-23T03:53:16.2243129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 119720 2022-11-23T03:53:16.2243563Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 119721 2022-11-23T03:53:16.2244002Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 119722 2022-11-23T03:53:16.2244623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:16.2245077Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:16.2245632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:16.2246100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:16.2246677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:16.2247124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:16.2247679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:16.2248146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:16.2248720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:16.2249147Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:16.2249714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:16.2250172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:16.2250739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T03:53:16.2251276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T03:53:16.2251861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T03:53:16.2252317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T03:53:16.2252737Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T03:53:16.2253217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T03:53:16.2253673Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T03:53:16.2254157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T03:53:16.2254629Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T03:53:16.2255114Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T03:53:16.2255594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T03:53:16.2256076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T03:53:16.2256710Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T03:53:16.2257464Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:16.2258139Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:16.2258797Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T03:53:16.2259704Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:16.2260293Z warnings.warn( 2022-11-23T03:53:16.2261053Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:16.2261592Z warnings.warn( 2022-11-23T03:53:16.2262322Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:16.2262859Z warnings.warn( 2022-11-23T03:53:16.2263597Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T03:53:16.2264441Z warnings.warn( 2022-11-23T03:53:16.2265205Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2923: UserWarning: torch.distributed._reduce_scatter_base is a private function and will be deprecated. Please use torch.distributed.reduce_scatter_tensor instead. 
2022-11-23T03:53:16.2265749Z warnings.warn( 2022-11-23T03:53:16.2266505Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2923: UserWarning: torch.distributed._reduce_scatter_base is a private function and will be deprecated. Please use torch.distributed.reduce_scatter_tensor instead. 2022-11-23T03:53:16.2267049Z warnings.warn( 2022-11-23T03:53:16.2267775Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2923: UserWarning: torch.distributed._reduce_scatter_base is a private function and will be deprecated. Please use torch.distributed.reduce_scatter_tensor instead. 2022-11-23T03:53:16.2268305Z warnings.warn( 2022-11-23T03:53:16.2269138Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2923: UserWarning: torch.distributed._reduce_scatter_base is a private function and will be deprecated. Please use torch.distributed.reduce_scatter_tensor instead. 2022-11-23T03:53:16.2269685Z warnings.warn( 2022-11-23T03:53:16.2269901Z ok (7.951s) 2022-11-23T03:53:16.2270046Z 2022-11-23T03:53:16.2270327Z ---------------------------------------------------------------------- 2022-11-23T03:53:16.2270657Z Ran 1 test in 7.952s 2022-11-23T03:53:16.2270818Z 2022-11-23T03:53:16.2270893Z OK 2022-11-23T03:53:16.2271024Z 2022-11-23T03:53:16.2271147Z Generating XML reports... 
2022-11-23T03:53:16.2271844Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype/TEST-TestShardedTensorMegatronLinear-20221123035307.xml 2022-11-23T03:53:16.2272269Z 2022-11-23T03:53:16.2272569Z ##[endgroup] 2022-11-23T03:53:16.2273263Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/test_megatron_prototype (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-test_megatron_prototype_qm92crgt) 2022-11-23T03:53:16.2273673Z 2022-11-23T03:53:16.5841633Z 2022-11-23T03:53:16.5842225Z real 0m15.927s 2022-11-23T03:53:16.5842499Z user 0m34.733s 2022-11-23T03:53:16.5842736Z sys 0m21.660s 2022-11-23T03:53:16.5843302Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/test_sharded_tensor 2022-11-23T03:53:18.9316251Z Ignoring disabled issues: [] 2022-11-23T03:53:18.9861982Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T03:53:18.9862586Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T03:53:18.9862945Z Selected tests: 2022-11-23T03:53:18.9863154Z distributed/_shard/sharded_tensor/test_sharded_tensor 2022-11-23T03:53:18.9891913Z Prioritized test from test file changes. 
2022-11-23T03:53:18.9892288Z reordering tests for PR: 2022-11-23T03:53:18.9892594Z prioritized: [] 2022-11-23T03:53:18.9893124Z the rest: ['distributed/_shard/sharded_tensor/test_sharded_tensor'] 2022-11-23T03:53:18.9893364Z 2022-11-23T03:53:18.9893910Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T03:53:18.9894780Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T03:53:18.9901549Z parallel (file granularity) tests: 2022-11-23T03:53:18.9901846Z 2022-11-23T03:53:18.9902109Z serial (file granularity) tests: 2022-11-23T03:53:18.9902450Z distributed/_shard/sharded_tensor/test_sharded_tensor 2022-11-23T03:53:21.2831197Z Ignoring disabled issues: [] 2022-11-23T03:53:21.7180710Z Running distributed/_shard/sharded_tensor/test_sharded_tensor ... [2022-11-23 03:53:21.717596] 2022-11-23T03:53:21.7182380Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 03:53:21.718037] 2022-11-23T04:05:13.1021997Z 2022-11-23T04:05:13.1025525Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/test_sharded_tensor 2022-11-23T04:05:13.1026604Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/test_sharded_tensor (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-test_sharded_tensor_ctwddu86) 2022-11-23T04:05:13.1027018Z 2022-11-23T04:05:13.1027132Z Running tests... 
2022-11-23T04:05:13.1027659Z ---------------------------------------------------------------------- 2022-11-23T04:05:13.1033671Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor 2022-11-23T04:05:13.1035154Z test_empty (__main__.TestCreateTensorFromParams) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:05:13.1035944Z ok (1.707s) 2022-11-23T04:05:13.1036578Z test_local_tensor (__main__.TestLocalTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 120240 2022-11-23T04:05:13.1037098Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 120241 2022-11-23T04:05:13.1037607Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 120242 2022-11-23T04:05:13.1038058Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 120243 2022-11-23T04:05:13.1038695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1039153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1039806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1040289Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1040873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1041301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1041881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1042567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1043147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:05:13.1043577Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1044148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1044616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1045194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1045622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1046192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1046655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1047083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1047574Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1048058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1048550Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1049010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1049573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1050060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1050538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1051182Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1051880Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1052555Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1053293Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1053672Z ok (4.320s) 2022-11-23T04:05:13.1054102Z test_local_tensor_error (__main__.TestLocalTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 120525 2022-11-23T04:05:13.1055010Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 120526 2022-11-23T04:05:13.1055446Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 120527 2022-11-23T04:05:13.1055906Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 120528 2022-11-23T04:05:13.1056522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1056978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1057534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1058021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1058598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1059106Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1059668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1060268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T04:05:13.1060852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1061297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1061851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1062318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1062891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1063316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1064462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1064981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1065419Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1065890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1066377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1066863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1067357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1067823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1068310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1068792Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1069492Z 
INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1070187Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1070863Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1071634Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1072008Z ok (4.318s) 2022-11-23T04:05:13.1072439Z test_collect_local_shard (__main__.TestModuleHookApi) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 120810 2022-11-23T04:05:13.1072959Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 120811 2022-11-23T04:05:13.1073402Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 120812 2022-11-23T04:05:13.1073834Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 120813 2022-11-23T04:05:13.1074449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1074903Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1075452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1075926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1076497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1076936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1077487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 
disabled tests 2022-11-23T04:05:13.1078025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1078594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1079027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1079571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1080029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1080596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1081012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1081572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1082030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1082460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1082931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1083407Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1083891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1084363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1084821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1085297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1085773Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1086403Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1087082Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1087755Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1088479Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1088852Z ok (4.318s) 2022-11-23T04:05:13.1089272Z test_reshard_output (__main__.TestModuleHookApi) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 121095 2022-11-23T04:05:13.1089785Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 121096 2022-11-23T04:05:13.1090226Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 121097 2022-11-23T04:05:13.1090650Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 121098 2022-11-23T04:05:13.1091247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1091693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1092244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1092708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1093279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1093717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T04:05:13.1094259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1094767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1095334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1095751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1096314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1096771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1097343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1097766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1098334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1098792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1099223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1099691Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1100167Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1100628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1101093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1101566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1102040Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1102517Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1103157Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1103839Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1104889Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1105686Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1106063Z ok (4.418s) 2022-11-23T04:05:13.1106508Z test_create_shard_with_no_placement (__main__.TestShardMetadata) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 121384 2022-11-23T04:05:13.1107047Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 121385 2022-11-23T04:05:13.1107479Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 121386 2022-11-23T04:05:13.1107922Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 121387 2022-11-23T04:05:13.1108532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1108977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1109533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1109999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1110574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1110995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1111558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1112109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1112682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1113100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1113662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1114124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1114693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1115112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1115669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1116127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1116540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1117007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1117462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1117925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1118295Z fi_getinfo: -61 2022-11-23T04:05:13.1118571Z fi_getinfo: -61 2022-11-23T04:05:13.1118840Z fi_getinfo: -61 2022-11-23T04:05:13.1119090Z fi_getinfo: -61 
2022-11-23T04:05:13.1119462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1119954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1120423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1120908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1121552Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1122233Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1122946Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1123628Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1124017Z ok (14.044s) 2022-11-23T04:05:13.1124445Z test_shard_metadata_init (__main__.TestShardMetadata) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 121853 2022-11-23T04:05:13.1124954Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 121854 2022-11-23T04:05:13.1125397Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 121855 2022-11-23T04:05:13.1125837Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 121856 2022-11-23T04:05:13.1126422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1126868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1127440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1127905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1128460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1128958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1129526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1129983Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1130536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1130975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1131542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1131986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1132555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T04:05:13.1132989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1133555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1133994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1134431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1134902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1135354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1135820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1136198Z fi_getinfo: -61 2022-11-23T04:05:13.1136473Z fi_getinfo: -61 2022-11-23T04:05:13.1136724Z fi_getinfo: -61 2022-11-23T04:05:13.1136990Z fi_getinfo: -61 2022-11-23T04:05:13.1137369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1137843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1138326Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1138807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1139435Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1140167Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1140852Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1141525Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1141895Z ok (13.443s) 2022-11-23T04:05:13.1142322Z test_shard_parameter (__main__.TestShardParameter) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 122322 2022-11-23T04:05:13.1142840Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 122323 2022-11-23T04:05:13.1143282Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 122324 2022-11-23T04:05:13.1143704Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 122325 2022-11-23T04:05:13.1144672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1145123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1145674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1146138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1146799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1147240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1147790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1148251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1148823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1149261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1149809Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1150267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1150836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1151254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1151816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1152272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1152707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1153179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1153658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1154140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1154592Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1155067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1155544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1156018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1156648Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1157383Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1158073Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1158743Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1159116Z ok (4.318s) 2022-11-23T04:05:13.1159549Z test_shard_parameter_errors (__main__.TestShardParameter) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 122611 2022-11-23T04:05:13.1160078Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 122612 2022-11-23T04:05:13.1160500Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 122613 2022-11-23T04:05:13.1160937Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 122614 2022-11-23T04:05:13.1161541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1161986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1162541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1163005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1163631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1164066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1164612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1165068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1165636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T04:05:13.1166053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1166614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1167069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1167636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1168052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1168617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1169108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1169528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1170018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1170502Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1170978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1171431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1171912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1172386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1172858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1173494Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1174225Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1174910Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1175580Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1175947Z ok (4.318s) 2022-11-23T04:05:13.1176358Z test_shard_tensor (__main__.TestShardTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 122896 2022-11-23T04:05:13.1176861Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 122897 2022-11-23T04:05:13.1177287Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 122898 2022-11-23T04:05:13.1177722Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 122899 2022-11-23T04:05:13.1178329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1178769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1179322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1179783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1180406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1180825Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1181393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1181853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1182422Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1182842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1183404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1184089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1184675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1185107Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1185668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1186129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1186541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1187030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1187508Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1187981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1188434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1188897Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1189369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1189827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1190480Z INFO:torch.distributed.distributed_c10d:Rank 
0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1191235Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1191925Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1192577Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1192962Z ok (4.317s) 2022-11-23T04:05:13.1193380Z test_shard_tensor_errors (__main__.TestShardTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 123185 2022-11-23T04:05:13.1193892Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 123186 2022-11-23T04:05:13.1194317Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 123187 2022-11-23T04:05:13.1194753Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 123188 2022-11-23T04:05:13.1195360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1195787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1196354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1196809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1197447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1197867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1198432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1198887Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1199459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1199875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1200438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1200890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1201444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1201883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1202447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1202904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1203319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1203809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1204290Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1204746Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1205219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1205756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1206231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1206687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 
to store for rank: 1 2022-11-23T04:05:13.1207333Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1208057Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1208741Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1209392Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1209786Z ok (4.418s) 2022-11-23T04:05:13.1210216Z test_cleanup (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 123470 2022-11-23T04:05:13.1210725Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 123471 2022-11-23T04:05:13.1211171Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 123472 2022-11-23T04:05:13.1211614Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 123473 2022-11-23T04:05:13.1212220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1212652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1213221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1213684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1214313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1214735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1215299Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1215757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1216314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1216750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1217313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1217772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1218324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1218763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1219324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1219761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1220189Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1220658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1221115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1221559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1221935Z fi_getinfo: -61 2022-11-23T04:05:13.1222212Z fi_getinfo: -61 2022-11-23T04:05:13.1222461Z fi_getinfo: -61 2022-11-23T04:05:13.1222724Z fi_getinfo: -61 2022-11-23T04:05:13.1223097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T04:05:13.1223570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1224366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1225102Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1225793Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1226305Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1226941Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1227615Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1227999Z ok (14.042s) 2022-11-23T04:05:13.1228425Z test_complete_world_size (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 123939 2022-11-23T04:05:13.1228963Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 123940 2022-11-23T04:05:13.1229412Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 123941 2022-11-23T04:05:13.1229834Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 123942 2022-11-23T04:05:13.1230438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1230883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1231539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1231988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1232558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1232997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1233562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1234005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1234573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1235009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1235556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1236017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1236585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T04:05:13.1237018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1237561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1238022Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1238456Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1238908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1239373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1239829Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1240213Z fi_getinfo: -61 2022-11-23T04:05:13.1240468Z fi_getinfo: -61 2022-11-23T04:05:13.1240731Z fi_getinfo: -61 2022-11-23T04:05:13.1240996Z fi_getinfo: -61 2022-11-23T04:05:13.1241350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1241837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1242371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1243028Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1243692Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1244223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1244859Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1245518Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1245902Z ok (14.046s) 2022-11-23T04:05:13.1246237Z test_create_sharded_tensor_like (__main__.TestShardedTensorChunked) 2022-11-23T04:05:13.1246777Z Test tensor like methods, i.e. torch.zeros_like(...), torch.full_like, etc. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 124412 2022-11-23T04:05:13.1247294Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 124413 2022-11-23T04:05:13.1247736Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 124414 2022-11-23T04:05:13.1248175Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 124415 2022-11-23T04:05:13.1248831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1249261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1249827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1250292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1250852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1251290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1251855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1252313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1274769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1275348Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1275961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1276436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1277023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1277450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1278018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1278481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1278931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1279383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1279839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1280309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1280679Z fi_getinfo: -61 2022-11-23T04:05:13.1280957Z fi_getinfo: -61 2022-11-23T04:05:13.1281225Z fi_getinfo: -61 2022-11-23T04:05:13.1281672Z fi_getinfo: -61 2022-11-23T04:05:13.1282071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1282572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1283074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1283555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1284217Z 
INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1284908Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1285584Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1286241Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1286634Z ok (13.640s) 2022-11-23T04:05:13.1286978Z test_create_sharded_tensor_with_full (__main__.TestShardedTensorChunked) 2022-11-23T04:05:13.1287448Z Test sharded_tensor.full(...) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 124881 2022-11-23T04:05:13.1288014Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 124882 2022-11-23T04:05:13.1288465Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 124883 2022-11-23T04:05:13.1288910Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 124884 2022-11-23T04:05:13.1289500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1289950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1290525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1290975Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1291555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1291996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1292568Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1293010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1293684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1294119Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1294692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1295136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1295717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1296247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1296804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1297266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1297708Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1298176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1298620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1299136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1299534Z fi_getinfo: -61 2022-11-23T04:05:13.1299788Z fi_getinfo: -61 2022-11-23T04:05:13.1300059Z fi_getinfo: -61 2022-11-23T04:05:13.1300330Z fi_getinfo: -61 2022-11-23T04:05:13.1300696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T04:05:13.1301209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1301702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1302356Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1302867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1303513Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1304565Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1305241Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1305745Z ok (13.744s) 2022-11-23T04:05:13.1306087Z test_create_sharded_tensor_with_ones (__main__.TestShardedTensorChunked) 2022-11-23T04:05:13.1306580Z Test sharded_tensor.ones(...) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 125350 2022-11-23T04:05:13.1307056Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 125351 2022-11-23T04:05:13.1307517Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 125352 2022-11-23T04:05:13.1307974Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 125353 2022-11-23T04:05:13.1308593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1309025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1309596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1310069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1310627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1311067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1311639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1312098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1312653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1313098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1313658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1314118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1314674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T04:05:13.1315111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1315669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1316110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1316612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1317090Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1317557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1318001Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1318385Z fi_getinfo: -61 2022-11-23T04:05:13.1318661Z fi_getinfo: -61 2022-11-23T04:05:13.1318916Z fi_getinfo: -61 2022-11-23T04:05:13.1319180Z fi_getinfo: -61 2022-11-23T04:05:13.1319561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1320035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1320525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1321182Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1321713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1322339Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1323011Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1323756Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1324136Z ok (14.045s) 2022-11-23T04:05:13.1324457Z test_create_sharded_tensor_with_rand (__main__.TestShardedTensorChunked) 2022-11-23T04:05:13.1324961Z Test sharded_tensor.rand(...)/randn(...) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 125819 2022-11-23T04:05:13.1325468Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 125820 2022-11-23T04:05:13.1325898Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 125821 2022-11-23T04:05:13.1326340Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 125822 2022-11-23T04:05:13.1326948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1327397Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1327953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1328418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1328989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1329407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1329985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1330452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1331031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1331459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T04:05:13.1332013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1332457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1333027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1333475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1334106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1334568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1334982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1335449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1335922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1336399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1336768Z fi_getinfo: -61 2022-11-23T04:05:13.1337041Z fi_getinfo: -61 2022-11-23T04:05:13.1337309Z fi_getinfo: -61 2022-11-23T04:05:13.1337558Z fi_getinfo: -61 2022-11-23T04:05:13.1337930Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1338429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1338905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1339555Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1340086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1340787Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1341445Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1342122Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1342505Z ok (14.046s) 2022-11-23T04:05:13.1342851Z test_create_sharded_tensor_with_zeros (__main__.TestShardedTensorChunked) 2022-11-23T04:05:13.1343326Z Test sharded_tensor.zeros(...) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 126288 2022-11-23T04:05:13.1343814Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 126289 2022-11-23T04:05:13.1344560Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 126290 2022-11-23T04:05:13.1344999Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 126291 2022-11-23T04:05:13.1345611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1346056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1346624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1347078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1347648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1348087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1348636Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1349104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1349670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1350106Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1350651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1351113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1351756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1352211Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1352773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1353244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1353680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1354130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1354590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1355065Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1355454Z fi_getinfo: -61 2022-11-23T04:05:13.1355710Z fi_getinfo: -61 2022-11-23T04:05:13.1355982Z fi_getinfo: -61 2022-11-23T04:05:13.1356248Z fi_getinfo: -61 2022-11-23T04:05:13.1356607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T04:05:13.1357106Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1357598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1358305Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1358822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1359529Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1360210Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1360874Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1361262Z ok (13.641s) 2022-11-23T04:05:13.1361564Z test_gather_even (__main__.TestShardedTensorChunked) 2022-11-23T04:05:13.1362065Z Test _sharded_tensor.gather(...) with evenly distributed._shards ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 126757 2022-11-23T04:05:13.1362614Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 126758 2022-11-23T04:05:13.1363065Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 126759 2022-11-23T04:05:13.1363506Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 126760 2022-11-23T04:05:13.1364093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1364543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1365115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1365581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1366131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1366571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1367128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1367571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1368131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1368563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1369209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1369669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1370255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T04:05:13.1370704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1371281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1371727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1372177Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1372656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1373113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1373581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1373974Z fi_getinfo: -61 2022-11-23T04:05:13.1374226Z fi_getinfo: -61 2022-11-23T04:05:13.1374496Z fi_getinfo: -61 2022-11-23T04:05:13.1374760Z fi_getinfo: -61 2022-11-23T04:05:13.1375120Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1375694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1376190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1376841Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1377506Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1378034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1378677Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1379347Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1379718Z ok (13.938s) 2022-11-23T04:05:13.1380029Z test_gather_uneven (__main__.TestShardedTensorChunked) 2022-11-23T04:05:13.1380548Z Test _sharded_tensor.gather(...) with unevenly distributed._shards ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 127230 2022-11-23T04:05:13.1381078Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 127231 2022-11-23T04:05:13.1381508Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 127232 2022-11-23T04:05:13.1381956Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 127233 2022-11-23T04:05:13.1382567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1382995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1383560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1384369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1384956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1385378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1385942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1386488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1387052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1387489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-11-23T04:05:13.1388049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1388510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1389063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1389503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1390062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1390524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1390947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1391416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1391884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1392332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1392785Z fi_getinfo: -61 2022-11-23T04:05:13.1393059Z fi_getinfo: -61 2022-11-23T04:05:13.1393313Z fi_getinfo: -61 2022-11-23T04:05:13.1393576Z fi_getinfo: -61 2022-11-23T04:05:13.1393954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1394430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1394927Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1395574Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1396253Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1396760Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1397402Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1398068Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1398454Z ok (13.740s) 2022-11-23T04:05:13.1398891Z test_insufficient_sharding_dims (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 127703 2022-11-23T04:05:13.1399443Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 127704 2022-11-23T04:05:13.1399888Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 127705 2022-11-23T04:05:13.1400331Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 127706 2022-11-23T04:05:13.1400920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1401370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1401938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1402387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1402961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1403406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1404044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: 
loaded 420 disabled tests 2022-11-23T04:05:13.1404495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1405067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1405564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1406117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1406573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1407140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1407578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1408124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1408588Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1409020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1409511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1410029Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1410509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1410979Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1411435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1411913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1412386Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1413034Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1413696Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1414375Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1415047Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1415430Z ok (3.917s) 2022-11-23T04:05:13.1415855Z test_invalid_pg_rpc_ranks (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 127973 2022-11-23T04:05:13.1416396Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 127974 2022-11-23T04:05:13.1416843Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 127975 2022-11-23T04:05:13.1417273Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 127976 2022-11-23T04:05:13.1417876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1418328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1418897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1419345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1419920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1420412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T04:05:13.1420969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1421427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1421991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1422432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1422975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1423434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1424192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1424637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1425192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1425652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1426086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1426636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1427201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1427682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1428156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1428612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1429091Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1429572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1430222Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1430885Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1431563Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1432232Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1432638Z fi_getinfo: -61 2022-11-23T04:05:13.1432909Z fi_getinfo: -61 2022-11-23T04:05:13.1433177Z fi_getinfo: -61 2022-11-23T04:05:13.1433425Z fi_getinfo: -61 2022-11-23T04:05:13.1433924Z [W tensorpipe_agent.cpp:725] RPC agent for worker0 encountered error when reading incoming request from worker3: eof (this error originated at tensorpipe/transport/shm/connection_impl.cc:259) 2022-11-23T04:05:13.1434641Z [W tensorpipe_agent.cpp:725] RPC agent for worker2 encountered error when reading incoming request from worker0: eof (this error originated at tensorpipe/transport/shm/connection_impl.cc:259) 2022-11-23T04:05:13.1435354Z [W tensorpipe_agent.cpp:725] RPC agent for worker1 encountered error when reading incoming request from worker0: eof (this error originated at tensorpipe/transport/shm/connection_impl.cc:259) 2022-11-23T04:05:13.1435804Z ok (4.416s) 2022-11-23T04:05:13.1436228Z test_invalid_sharding (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 128414 2022-11-23T04:05:13.1436759Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 128415 2022-11-23T04:05:13.1437270Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 128416 2022-11-23T04:05:13.1437709Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 128417 2022-11-23T04:05:13.1438316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1438765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1439345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1439795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1440369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1440807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1441354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1441819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1442397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1442837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1443380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1443897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1444466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T04:05:13.1444905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1445448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1445908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1446346Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1446815Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1447289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1447773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1448251Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1448707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1449182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1449664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1450293Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1450976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1451647Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1452317Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1452721Z fi_getinfo: -61 2022-11-23T04:05:13.1452993Z fi_getinfo: -61 2022-11-23T04:05:13.1453264Z fi_getinfo: -61 2022-11-23T04:05:13.1453514Z fi_getinfo: -61 2022-11-23T04:05:13.1454053Z [W tensorpipe_agent.cpp:725] RPC agent for worker3 encountered error when reading incoming request from worker0: eof (this error originated at tensorpipe/transport/shm/connection_impl.cc:259) 2022-11-23T04:05:13.1454768Z [W tensorpipe_agent.cpp:725] RPC agent for worker2 encountered error when reading incoming request from worker0: eof (this error originated at tensorpipe/transport/shm/connection_impl.cc:259) 2022-11-23T04:05:13.1455479Z [W tensorpipe_agent.cpp:725] RPC agent for worker1 encountered error when reading incoming request from worker0: eof (this error originated at tensorpipe/transport/shm/connection_impl.cc:259) 2022-11-23T04:05:13.1455916Z ok (13.840s) 2022-11-23T04:05:13.1456366Z test_load_state_dict_errors (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 128870 2022-11-23T04:05:13.1456905Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 128871 2022-11-23T04:05:13.1457358Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 128872 2022-11-23T04:05:13.1457783Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 128873 2022-11-23T04:05:13.1458399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1458848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1459420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1459868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1460502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1460942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1461490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1461950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1462526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1462962Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1463524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1464175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1464767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1465188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1465823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1466281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1466735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1467189Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1467651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1468117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1468500Z fi_getinfo: -61 2022-11-23T04:05:13.1468760Z fi_getinfo: -61 2022-11-23T04:05:13.1469027Z fi_getinfo: -61 2022-11-23T04:05:13.1469347Z fi_getinfo: -61 
2022-11-23T04:05:13.1469709Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1470205Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1470694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1471237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1471900Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1472585Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1473257Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1473908Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1474436Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:05:13.1474921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:05:13.1475413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:05:13.1475883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:05:13.1476523Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1477198Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T04:05:13.1477925Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1478594Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1478984Z ok (13.945s) 2022-11-23T04:05:13.1479437Z test_multiple_local_shards (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 129329 2022-11-23T04:05:13.1479966Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 129330 2022-11-23T04:05:13.1480413Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 129331 2022-11-23T04:05:13.1480857Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 129332 2022-11-23T04:05:13.1481460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1481898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1482467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1482933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1483487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1483927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1484493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1484953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1485507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T04:05:13.1485955Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1486515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1486975Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1487528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1487967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1488580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1489029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1489465Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1489935Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1490405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1490776Z fi_getinfo: -61 2022-11-23T04:05:13.1491134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1491514Z fi_getinfo: -61 2022-11-23T04:05:13.1491767Z fi_getinfo: -61 2022-11-23T04:05:13.1492033Z fi_getinfo: -61 2022-11-23T04:05:13.1492406Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1492887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1493379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1493866Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 
2022-11-23T04:05:13.1494515Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1495282Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1495955Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1496628Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1497010Z ok (14.241s) 2022-11-23T04:05:13.1497429Z test_new_group (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 129802 2022-11-23T04:05:13.1497955Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 129803 2022-11-23T04:05:13.1498402Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 129804 2022-11-23T04:05:13.1498827Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 129805 2022-11-23T04:05:13.1499436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1499882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1500452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1500899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1501477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1501918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1502468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: 
UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1502930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1503502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1504269Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1504844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1505306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1506000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1506434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1507000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1507460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1507901Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1508354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1508810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1509267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1509648Z fi_getinfo: -61 2022-11-23T04:05:13.1509902Z fi_getinfo: -61 2022-11-23T04:05:13.1510176Z fi_getinfo: -61 2022-11-23T04:05:13.1510444Z fi_getinfo: -61 2022-11-23T04:05:13.1510802Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1511297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 1 2022-11-23T04:05:13.1511788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1512324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1512973Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1513653Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1514324Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1514830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:05:13.1515317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:05:13.1515802Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:05:13.1516441Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1516947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:05:13.1517577Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1518241Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1518892Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T04:05:13.1519552Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1519943Z ok (14.046s) 2022-11-23T04:05:13.1520387Z test_partial_world_size (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 130278 2022-11-23T04:05:13.1520909Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 130279 2022-11-23T04:05:13.1521357Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 130280 2022-11-23T04:05:13.1521800Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 130281 2022-11-23T04:05:13.1522403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1522835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1523459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1523930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1524484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1524928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1525490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1525947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1526499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1526941Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1527514Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1527973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1528521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1528959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1529578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1530022Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1530459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1530925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1531393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1531837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1533049Z fi_getinfo: -61 2022-11-23T04:05:13.1533325Z fi_getinfo: -61 2022-11-23T04:05:13.1533574Z fi_getinfo: -61 2022-11-23T04:05:13.1533836Z fi_getinfo: -61 2022-11-23T04:05:13.1534208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1534688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1535181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1535833Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1536364Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1536989Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1537663Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1538330Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1538711Z ok (14.043s) 2022-11-23T04:05:13.1539147Z test_sharded_tensor_metadata (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 130751 2022-11-23T04:05:13.1539690Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 130752 2022-11-23T04:05:13.1540134Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 130753 2022-11-23T04:05:13.1540565Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 130754 2022-11-23T04:05:13.1541223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1541674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1542244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1542690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1543259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1543695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1544471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: 
loaded 420 disabled tests 2022-11-23T04:05:13.1544918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1545492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1545926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1546470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1546924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1547619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1548054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1548594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1549051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1549483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1549934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1550399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1550857Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1551236Z fi_getinfo: -61 2022-11-23T04:05:13.1551495Z fi_getinfo: -61 2022-11-23T04:05:13.1551759Z fi_getinfo: -61 2022-11-23T04:05:13.1552025Z fi_getinfo: -61 2022-11-23T04:05:13.1552381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1552878Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 
1 2022-11-23T04:05:13.1553366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1554000Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1554525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1555163Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1555828Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1556486Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1556865Z ok (14.046s) 2022-11-23T04:05:13.1557306Z test_sharded_tensor_sizes (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 448 2022-11-23T04:05:13.1557832Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 449 2022-11-23T04:05:13.1558331Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 450 2022-11-23T04:05:13.1558773Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 451 2022-11-23T04:05:13.1559370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1559797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1560353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1560795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1561346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1561769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1562334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1562805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1563388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1563834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1564417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1564929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1565484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow 
tests 2022-11-23T04:05:13.1565924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1566487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1566947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1567365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1567833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1568292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1568741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1569123Z fi_getinfo: -61 2022-11-23T04:05:13.1569442Z fi_getinfo: -61 2022-11-23T04:05:13.1569577Z fi_getinfo: -61 2022-11-23T04:05:13.1569710Z fi_getinfo: -61 2022-11-23T04:05:13.1569939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1570179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1570419Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1570811Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1571048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1571439Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1571833Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1572217Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1572327Z ok (13.745s) 2022-11-23T04:05:13.1572622Z test_sharding_columns (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 917 2022-11-23T04:05:13.1572889Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 918 2022-11-23T04:05:13.1573105Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 919 2022-11-23T04:05:13.1573314Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 920 2022-11-23T04:05:13.1573687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1573866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1574240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1574429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1574772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1574947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1575315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1575502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1575856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1576075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1576441Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1576626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1576989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1577142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1577510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1577695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1577920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1578162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1578390Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1578630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1578851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1579067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1579294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1579530Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1579928Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1580319Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1580708Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1581095Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1581197Z ok (4.117s) 2022-11-23T04:05:13.1581495Z test_state_dict (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1189 2022-11-23T04:05:13.1581741Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1190 2022-11-23T04:05:13.1581961Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1191 2022-11-23T04:05:13.1582170Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1192 2022-11-23T04:05:13.1582538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1582713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1583086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1583273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1583633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1583805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1584377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1584569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1584934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1585182Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1585550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1585734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1586090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1586262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1586610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1586795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1587022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1587240Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1587466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1587685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1587825Z fi_getinfo: -61 2022-11-23T04:05:13.1587957Z fi_getinfo: -61 2022-11-23T04:05:13.1588074Z fi_getinfo: -61 2022-11-23T04:05:13.1588207Z fi_getinfo: -61 2022-11-23T04:05:13.1588449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1588697Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1588935Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1589326Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1589564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1589957Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1590346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1590714Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1590815Z ok (13.845s) 2022-11-23T04:05:13.1591189Z test_state_dict_new_group (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1658 2022-11-23T04:05:13.1591411Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1659 2022-11-23T04:05:13.1591623Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1660 2022-11-23T04:05:13.1591838Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1661 2022-11-23T04:05:13.1592210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1592383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1592741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1592930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1593291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1593463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1593833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 
disabled tests 2022-11-23T04:05:13.1594018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1594451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1594620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1594993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1595164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1595525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1595695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1596126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1596314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1596549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1596777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1597000Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1597200Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1597344Z fi_getinfo: -61 2022-11-23T04:05:13.1597479Z fi_getinfo: -61 2022-11-23T04:05:13.1597611Z fi_getinfo: -61 2022-11-23T04:05:13.1597745Z fi_getinfo: -61 2022-11-23T04:05:13.1597989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1598226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T04:05:13.1598447Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1598844Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1599078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1599469Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1599907Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1600195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:05:13.1600435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:05:13.1600670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:05:13.1601060Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1601290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:05:13.1601653Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1602043Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1602429Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T04:05:13.1602808Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1602909Z ok (14.146s) 2022-11-23T04:05:13.1603236Z test_state_dict_no_sharded_tensors (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2136 2022-11-23T04:05:13.1603500Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2137 2022-11-23T04:05:13.1603713Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 2138 2022-11-23T04:05:13.1603922Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 2139 2022-11-23T04:05:13.1604276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1604455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1604831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1605023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1605384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1605609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1605982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1606168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1606509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1606678Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1607054Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1607240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1607597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1607770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1608138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1608325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1608553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1608761Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1609033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1609262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1609404Z fi_getinfo: -61 2022-11-23T04:05:13.1609538Z fi_getinfo: -61 2022-11-23T04:05:13.1609670Z fi_getinfo: -61 2022-11-23T04:05:13.1609802Z fi_getinfo: -61 2022-11-23T04:05:13.1610028Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1610275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1610513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1610905Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1611139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1611525Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1611916Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1612298Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1612447Z ok (13.845s) 2022-11-23T04:05:13.1612737Z test_custom_op (__main__.TestShardedTensorCustomOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2605 2022-11-23T04:05:13.1612954Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2606 2022-11-23T04:05:13.1613164Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 2607 2022-11-23T04:05:13.1613375Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 2608 2022-11-23T04:05:13.1613746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1613919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1614293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1614481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1614843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1615000Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1615370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T04:05:13.1615557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1615917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1616087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1616458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1616642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1616996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1617148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1617515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1617703Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1617928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1618199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1618428Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1618650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1618791Z fi_getinfo: -61 2022-11-23T04:05:13.1618914Z fi_getinfo: -61 2022-11-23T04:05:13.1619047Z fi_getinfo: -61 2022-11-23T04:05:13.1619179Z fi_getinfo: -61 2022-11-23T04:05:13.1619421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1619660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T04:05:13.1619898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1620296Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1620528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1620897Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1621284Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1621718Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1621818Z ok (13.644s) 2022-11-23T04:05:13.1622131Z test_custom_op_errors (__main__.TestShardedTensorCustomOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3074 2022-11-23T04:05:13.1622344Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3075 2022-11-23T04:05:13.1622560Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 3076 2022-11-23T04:05:13.1622771Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 3077 2022-11-23T04:05:13.1623135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1623293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1623668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1624036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1624416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1624593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1624970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1625157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1625518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1625672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1626043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1626231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1626590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:05:13.1626760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1627125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1627380Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1627614Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1627839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1628042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1628269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1628411Z fi_getinfo: -61 2022-11-23T04:05:13.1628544Z fi_getinfo: -61 2022-11-23T04:05:13.1628676Z fi_getinfo: -61 2022-11-23T04:05:13.1628810Z fi_getinfo: -61 2022-11-23T04:05:13.1629053Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1629275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1629516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1629747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1630142Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1630534Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1630992Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1631376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1631475Z ok (13.741s) 2022-11-23T04:05:13.1631798Z test_custom_op_override (__main__.TestShardedTensorCustomOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3543 2022-11-23T04:05:13.1631997Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3544 2022-11-23T04:05:13.1632211Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 3545 2022-11-23T04:05:13.1632421Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 3546 2022-11-23T04:05:13.1632790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1632967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1633341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1633530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1633889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1634063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1634417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1634603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1634959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1635131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1635497Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1635681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1636042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1636212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1636603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1636793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1637021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1637246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1637467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1637689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1637832Z fi_getinfo: -61 2022-11-23T04:05:13.1637966Z fi_getinfo: -61 2022-11-23T04:05:13.1638083Z fi_getinfo: -61 2022-11-23T04:05:13.1638217Z fi_getinfo: -61 2022-11-23T04:05:13.1638464Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1638704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1638938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1639173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1639618Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for 
key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1640007Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1640374Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1640766Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1640877Z ok (13.941s) 2022-11-23T04:05:13.1641099Z test_create_sharded_tensor_with_ones (__main__.TestShardedTensorEnumerable) 2022-11-23T04:05:13.1641362Z Test sharded_tensor.ones(...) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4016 2022-11-23T04:05:13.1641582Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4017 2022-11-23T04:05:13.1641803Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4018 2022-11-23T04:05:13.1642016Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4019 2022-11-23T04:05:13.1642388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1642544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1642922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1643121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1643483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1643659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1644032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T04:05:13.1644221Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1644576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1644728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1645094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1645327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1645701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1645872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1646237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1646424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1646650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1646877Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1647079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1647305Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1647449Z fi_getinfo: -61 2022-11-23T04:05:13.1647584Z fi_getinfo: -61 2022-11-23T04:05:13.1647717Z fi_getinfo: -61 2022-11-23T04:05:13.1647848Z fi_getinfo: -61 2022-11-23T04:05:13.1648091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1648314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T04:05:13.1648604Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1649002Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1649396Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1649630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1650019Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1650409Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1650511Z ok (13.845s) 2022-11-23T04:05:13.1650699Z test_gather_even (__main__.TestShardedTensorEnumerable) 2022-11-23T04:05:13.1650991Z Test _sharded_tensor.gather(...) with evenly distributed._shards ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4485 2022-11-23T04:05:13.1651206Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4486 2022-11-23T04:05:13.1651419Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4487 2022-11-23T04:05:13.1651629Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4488 2022-11-23T04:05:13.1652002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1652178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1652543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1652713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1653071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1653262Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1653637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1653823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1654182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1654402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1654781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1654969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1655326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:05:13.1655485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1655849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1656035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1656262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1656493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1656711Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1656935Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1657077Z fi_getinfo: -61 2022-11-23T04:05:13.1657195Z fi_getinfo: -61 2022-11-23T04:05:13.1657328Z fi_getinfo: -61 2022-11-23T04:05:13.1657550Z fi_getinfo: -61 2022-11-23T04:05:13.1657794Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1658034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1658271Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1658505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1658902Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1659340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1659733Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1660119Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1660223Z ok (14.042s) 2022-11-23T04:05:13.1660415Z test_gather_uneven (__main__.TestShardedTensorEnumerable) 2022-11-23T04:05:13.1660727Z Test _sharded_tensor.gather(...) with unevenly distributed._shards ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4958 2022-11-23T04:05:13.1660941Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4959 2022-11-23T04:05:13.1661154Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4960 2022-11-23T04:05:13.1661365Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4961 2022-11-23T04:05:13.1661715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1661887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1662268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1662459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1662825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1662996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1663368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1663609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1664142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1664323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T04:05:13.1664699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1664892Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1665250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1665421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1665786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1665974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1666199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1666402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1666625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1666923Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1667066Z fi_getinfo: -61 2022-11-23T04:05:13.1667199Z fi_getinfo: -61 2022-11-23T04:05:13.1667332Z fi_getinfo: -61 2022-11-23T04:05:13.1667522Z fi_getinfo: -61 2022-11-23T04:05:13.1667751Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1667991Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1668231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1668625Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1668859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1669249Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1669691Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1670078Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1670178Z ok (13.744s) 2022-11-23T04:05:13.1670474Z test_grid_sharding (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5431 2022-11-23T04:05:13.1670690Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5432 2022-11-23T04:05:13.1670903Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5433 2022-11-23T04:05:13.1671112Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5434 2022-11-23T04:05:13.1671483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1671659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1672033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1672224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1672570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1672804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1673185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T04:05:13.1673373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1673729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1673902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1674266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1674451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1674814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1674968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1675335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1675519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1675745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1675971Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1676245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1676475Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1676615Z fi_getinfo: -61 2022-11-23T04:05:13.1676733Z fi_getinfo: -61 2022-11-23T04:05:13.1676866Z fi_getinfo: -61 2022-11-23T04:05:13.1676999Z fi_getinfo: -61 2022-11-23T04:05:13.1677242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1677486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T04:05:13.1677721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1677957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1678354Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1678731Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1679121Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1679507Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1679612Z ok (13.944s) 2022-11-23T04:05:13.1679934Z test_multiple_local_shards (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5904 2022-11-23T04:05:13.1680147Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5905 2022-11-23T04:05:13.1680359Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5906 2022-11-23T04:05:13.1680573Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5907 2022-11-23T04:05:13.1680940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1681098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1681470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1681662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1682071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1682249Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1682619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1682810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1683169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1683322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1683686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1683870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1684235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:05:13.1684405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1684765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1684950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1685233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1685457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1685657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1685878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1686019Z fi_getinfo: -61 2022-11-23T04:05:13.1686152Z fi_getinfo: -61 2022-11-23T04:05:13.1686289Z fi_getinfo: -61 2022-11-23T04:05:13.1686424Z fi_getinfo: -61 2022-11-23T04:05:13.1686665Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1686889Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1687127Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1687522Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1687909Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1688143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1688526Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1688910Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1689009Z ok (13.846s) 2022-11-23T04:05:13.1689317Z test_new_group (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6377 2022-11-23T04:05:13.1689519Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6378 2022-11-23T04:05:13.1689733Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6379 2022-11-23T04:05:13.1689945Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6380 2022-11-23T04:05:13.1690310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1690483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1690906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1691105Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1691465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1691637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1691993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1692186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1692545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1692721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1693097Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1693288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1693659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1693833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1694181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1694420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1694654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1694885Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1695108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1695254Z fi_getinfo: -61 2022-11-23T04:05:13.1695477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1695619Z fi_getinfo: -61 2022-11-23T04:05:13.1695736Z fi_getinfo: -61 2022-11-23T04:05:13.1695874Z fi_getinfo: -61 2022-11-23T04:05:13.1696120Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1696365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1696610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1696847Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1697244Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for 
key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1697644Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1698017Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1698408Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1698648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:05:13.1698889Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:05:13.1699128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:05:13.1699358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:05:13.1699748Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1700219Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1700618Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1701002Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1701089Z ok (13.946s) 2022-11-23T04:05:13.1701411Z test_partial_world_size (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6853 2022-11-23T04:05:13.1701627Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6854 2022-11-23T04:05:13.1701842Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6855 2022-11-23T04:05:13.1702055Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6856 2022-11-23T04:05:13.1702430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1702611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1702991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1703165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1703581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1703761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1704325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1704523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1704888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1705066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1705436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1705663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1706019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:05:13.1706193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1706561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1706750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1706984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1707220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1707444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1707671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1707794Z fi_getinfo: -61 2022-11-23T04:05:13.1707941Z fi_getinfo: -61 2022-11-23T04:05:13.1708078Z fi_getinfo: -61 2022-11-23T04:05:13.1708214Z fi_getinfo: -61 2022-11-23T04:05:13.1708463Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1708708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1708954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1709427Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1709655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1710060Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1710453Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1710844Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1710949Z ok (14.044s) 2022-11-23T04:05:13.1711278Z test_sharded_tensor_device (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7326 2022-11-23T04:05:13.1711495Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7327 2022-11-23T04:05:13.1711715Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7328 2022-11-23T04:05:13.1711908Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7329 2022-11-23T04:05:13.1712282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1712458Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1712902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1713097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1713461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1713634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1714009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1714203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1714545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1714719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1715085Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1715278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1715637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1715814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1716191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1716385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1716617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1716826Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1717050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1717280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1717424Z fi_getinfo: -61 2022-11-23T04:05:13.1717562Z fi_getinfo: -61 2022-11-23T04:05:13.1717697Z fi_getinfo: -61 2022-11-23T04:05:13.1717834Z fi_getinfo: -61 2022-11-23T04:05:13.1718060Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1718308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1718598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1719002Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1719396Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1719637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1720023Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1720412Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1720518Z ok (13.641s) 2022-11-23T04:05:13.1720831Z test_sharded_tensor_metadata (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7795 2022-11-23T04:05:13.1721051Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7796 2022-11-23T04:05:13.1721267Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7797 2022-11-23T04:05:13.1721480Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7798 2022-11-23T04:05:13.1721854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1722096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1722474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1722668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1723011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1723191Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1723567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 
420 disabled tests 2022-11-23T04:05:13.1723759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1724121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1724304Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1724674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1724864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1725229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1725388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1725759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1725949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1726178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1726410Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1726635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1726863Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1727008Z fi_getinfo: -61 2022-11-23T04:05:13.1727127Z fi_getinfo: -61 2022-11-23T04:05:13.1727264Z fi_getinfo: -61 2022-11-23T04:05:13.1727402Z fi_getinfo: -61 2022-11-23T04:05:13.1727696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1727948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 
2022-11-23T04:05:13.1728193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1728589Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1728987Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1729204Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1729593Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1729985Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1730090Z ok (14.044s) 2022-11-23T04:05:13.1730418Z test_sharded_tensor_to_cpu (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8264 2022-11-23T04:05:13.1730637Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8265 2022-11-23T04:05:13.1730855Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8266 2022-11-23T04:05:13.1731116Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8267 2022-11-23T04:05:13.1731486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1731643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1732017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1732211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1732567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1732737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1733107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1733300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1733652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1733805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1734171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1734358Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1734724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:05:13.1734893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1735257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1735441Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1735671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1735896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1736104Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1736327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1736468Z fi_getinfo: -61 2022-11-23T04:05:13.1736665Z fi_getinfo: -61 2022-11-23T04:05:13.1736807Z fi_getinfo: -61 2022-11-23T04:05:13.1736944Z fi_getinfo: -61 2022-11-23T04:05:13.1737189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1737411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1737649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1738047Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1738284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1738707Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1739099Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1739488Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1739724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:05:13.1739957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:05:13.1740225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:05:13.1740459Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:05:13.1740844Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1741232Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1741624Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1742010Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1742109Z ok (13.846s) 2022-11-23T04:05:13.1742435Z test_sharded_tensor_to_cuda (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8745 2022-11-23T04:05:13.1742654Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8746 2022-11-23T04:05:13.1742846Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8747 2022-11-23T04:05:13.1743057Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8748 2022-11-23T04:05:13.1743427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1743605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1744163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1744361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1744727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1744904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1745274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1745448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1745802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1746039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1746419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1746604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1746967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:05:13.1747142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1747506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1747675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1747903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1748127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1748347Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1748570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1748711Z fi_getinfo: -61 2022-11-23T04:05:13.1748846Z fi_getinfo: -61 2022-11-23T04:05:13.1748980Z fi_getinfo: -61 2022-11-23T04:05:13.1749096Z fi_getinfo: -61 2022-11-23T04:05:13.1749403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1749645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1749882Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1750276Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1750516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1750907Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1751294Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1751677Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1751898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:05:13.1752134Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:05:13.1752370Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:05:13.1752599Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:05:13.1752990Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1753379Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1753759Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1754140Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1754241Z ok (13.642s) 2022-11-23T04:05:13.1754548Z test_sharded_tensor_to_test (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9226 2022-11-23T04:05:13.1754762Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9227 2022-11-23T04:05:13.1755019Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9228 2022-11-23T04:05:13.1755237Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9229 2022-11-23T04:05:13.1755608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1755781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1756158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1756351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1756694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1756866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1757238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1757430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1757787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1757956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1758322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1758559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1758924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:05:13.1759078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1759444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1759632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1759860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1760083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1760301Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1760527Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1760668Z fi_getinfo: -61 2022-11-23T04:05:13.1760786Z fi_getinfo: -61 2022-11-23T04:05:13.1760920Z fi_getinfo: -61 2022-11-23T04:05:13.1761052Z fi_getinfo: -61 2022-11-23T04:05:13.1761295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1761535Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1761777Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1762169Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1762405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1762779Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1763169Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1763553Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1764177Z /opt/conda/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py:669: UserWarning: ShardedTensor.to only move tensor to its current deviceIf you want to put to different device, use `reshard` instead. 2022-11-23T04:05:13.1764384Z warnings.warn("ShardedTensor.to only move tensor to its current device" 2022-11-23T04:05:13.1764953Z /opt/conda/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py:669: UserWarning: ShardedTensor.to only move tensor to its current deviceIf you want to put to different device, use `reshard` instead. 2022-11-23T04:05:13.1765152Z warnings.warn("ShardedTensor.to only move tensor to its current device" 2022-11-23T04:05:13.1765708Z /opt/conda/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py:669: UserWarning: ShardedTensor.to only move tensor to its current deviceIf you want to put to different device, use `reshard` instead. 2022-11-23T04:05:13.1765905Z warnings.warn("ShardedTensor.to only move tensor to its current device" 2022-11-23T04:05:13.1766473Z /opt/conda/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py:669: UserWarning: ShardedTensor.to only move tensor to its current deviceIf you want to put to different device, use `reshard` instead. 
2022-11-23T04:05:13.1766670Z warnings.warn("ShardedTensor.to only move tensor to its current device" 2022-11-23T04:05:13.1766894Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:05:13.1767136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:05:13.1767423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:05:13.1767654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:05:13.1768051Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1768438Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1768828Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1769212Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1769322Z ok (14.245s) 2022-11-23T04:05:13.1769646Z test_uneven_shards (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9707 2022-11-23T04:05:13.1769869Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9708 2022-11-23T04:05:13.1770080Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9709 2022-11-23T04:05:13.1770289Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9710 2022-11-23T04:05:13.1770661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1770838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1771213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1771405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1771749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1771923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1772292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1772479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1772835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1773004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1773413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1773602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1773972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:05:13.1774130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1774493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1774678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1774905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1775148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1775377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1775619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1775841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1776063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1776338Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1776573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1776969Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1777363Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1777747Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1778136Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1778236Z ok (4.119s) 2022-11-23T04:05:13.1778549Z test_with_rpc_names (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9979 2022-11-23T04:05:13.1778749Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9980 2022-11-23T04:05:13.1778962Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9981 2022-11-23T04:05:13.1779172Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9982 2022-11-23T04:05:13.1779540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1779716Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1780094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1780281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1780637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1780814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1781162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1781350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1781760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1781930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1782338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1782529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T04:05:13.1782886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1783061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1783408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1783594Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1783820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1784224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1784458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1784678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1784824Z fi_getinfo: -61 2022-11-23T04:05:13.1784959Z fi_getinfo: -61 2022-11-23T04:05:13.1785076Z fi_getinfo: -61 2022-11-23T04:05:13.1785210Z fi_getinfo: -61 2022-11-23T04:05:13.1785454Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1785783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1786022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1786416Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1786649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1787043Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1787432Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1787797Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1787901Z ok (13.946s) 2022-11-23T04:05:13.1788834Z test_init_from_local_shards (__main__.TestShardedTensorFromLocalShards) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78068 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.002s) 2022-11-23T04:05:13.1789206Z test_init_from_local_shards_and_global_metadata (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10452 2022-11-23T04:05:13.1789423Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10453 2022-11-23T04:05:13.1789636Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10454 2022-11-23T04:05:13.1789848Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10455 2022-11-23T04:05:13.1790221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1790395Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1790771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1790944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1791366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1791550Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1791921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1792106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1792467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1792638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1793010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1793178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1793537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1793705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1794073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1794260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1794487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1794768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1794993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1795218Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1795343Z fi_getinfo: -61 2022-11-23T04:05:13.1795477Z fi_getinfo: -61 2022-11-23T04:05:13.1795609Z fi_getinfo: -61 2022-11-23T04:05:13.1795743Z fi_getinfo: -61 2022-11-23T04:05:13.1795989Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1796230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1796469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1796848Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1797089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1797475Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1797863Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1798246Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1798347Z ok (13.842s) 2022-11-23T04:05:13.1798731Z test_init_from_local_shards_and_global_metadata_invalid_shards (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10925 2022-11-23T04:05:13.1798947Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10926 2022-11-23T04:05:13.1799164Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10927 2022-11-23T04:05:13.1799359Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10928 2022-11-23T04:05:13.1799728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1799899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1800309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1800487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1800862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1801051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1801426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1801613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1801960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1802134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1802500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1802688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1803050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:05:13.1803220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1803585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1803821Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1804030Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1804257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1804476Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1804701Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1804846Z fi_getinfo: -61 2022-11-23T04:05:13.1804981Z fi_getinfo: -61 2022-11-23T04:05:13.1805113Z fi_getinfo: -61 2022-11-23T04:05:13.1805245Z fi_getinfo: -61 2022-11-23T04:05:13.1805472Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1805762Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1806009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1806242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1806636Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1807029Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1807421Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1807811Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1807895Z ok (13.947s) 2022-11-23T04:05:13.1808260Z test_init_from_local_shards_invalid_local_shards (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11394 2022-11-23T04:05:13.1808482Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11395 2022-11-23T04:05:13.1808695Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11396 2022-11-23T04:05:13.1808907Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11397 2022-11-23T04:05:13.1809321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1809499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1809878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1810065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1810408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1810584Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1810955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1811142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1811498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1811669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1812036Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1812220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1812584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1812790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1813159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1813344Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1813569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1813797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1814014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1814235Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1814374Z fi_getinfo: -61 2022-11-23T04:05:13.1814492Z fi_getinfo: -61 2022-11-23T04:05:13.1814626Z fi_getinfo: -61 2022-11-23T04:05:13.1814763Z fi_getinfo: -61 2022-11-23T04:05:13.1815007Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1815249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1815487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1815883Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1816124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1816503Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1816890Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1817278Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1817378Z ok (13.544s) 2022-11-23T04:05:13.1817738Z test_init_from_local_shards_invalid_pin_memory (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11863 2022-11-23T04:05:13.1817956Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11864 2022-11-23T04:05:13.1818170Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11865 2022-11-23T04:05:13.1818424Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11866 2022-11-23T04:05:13.1818786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1818961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1819339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1819533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1819891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1820061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1820426Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1820618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1820975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1821127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1821499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1821765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1822123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1822297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1822670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1822858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1823084Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1823312Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1823514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1823736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1824162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1824411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1824650Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1824881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1825288Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1825682Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1826063Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1826438Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1826539Z ok (4.318s) 2022-11-23T04:05:13.1826915Z test_init_from_local_shards_invalid_property_cross_ranks (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12143 2022-11-23T04:05:13.1827131Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12144 2022-11-23T04:05:13.1827415Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12145 2022-11-23T04:05:13.1827638Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12146 2022-11-23T04:05:13.1828011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1828185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1828538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1828715Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1829088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T04:05:13.1829278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1829654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1829841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1830202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1830375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1830742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1830978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1831338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1831512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1831878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1832065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1832293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1832518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1832741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1832946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1833089Z fi_getinfo: -61 2022-11-23T04:05:13.1833224Z fi_getinfo: -61 2022-11-23T04:05:13.1833357Z fi_getinfo: -61 2022-11-23T04:05:13.1833490Z fi_getinfo: -61 
2022-11-23T04:05:13.1833734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1833975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1834216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1834433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1834833Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1835224Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1835609Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1835998Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1836097Z ok (13.843s) 2022-11-23T04:05:13.1836529Z test_init_from_local_shards_invalid_shards_gaps (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12612 2022-11-23T04:05:13.1836753Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12613 2022-11-23T04:05:13.1836966Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12614 2022-11-23T04:05:13.1837163Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12615 2022-11-23T04:05:13.1837539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1837712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1838086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1838274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1838631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1838805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1839174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1839343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1839709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1839975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1840341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1840528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1840884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:05:13.1841097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1841460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1841642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1841855Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1842084Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1842307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1842523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1842664Z fi_getinfo: -61 2022-11-23T04:05:13.1842799Z fi_getinfo: -61 2022-11-23T04:05:13.1842932Z fi_getinfo: -61 2022-11-23T04:05:13.1843050Z fi_getinfo: -61 2022-11-23T04:05:13.1843297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1843536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1843774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1844168Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1844405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1844795Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1845183Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1845614Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1845701Z ok (13.844s) 2022-11-23T04:05:13.1846074Z test_init_from_local_shards_invalid_shards_overlap (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13081 2022-11-23T04:05:13.1846292Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13082 2022-11-23T04:05:13.1846508Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13083 2022-11-23T04:05:13.1846719Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13084 2022-11-23T04:05:13.1847091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1847264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1847642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1847832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1848178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1848350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1848716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1848965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1849329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1849499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1849860Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1850046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1850389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1850563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1850927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1851115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1851341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1851566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1851790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1852006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1852149Z fi_getinfo: -61 2022-11-23T04:05:13.1852271Z fi_getinfo: -61 2022-11-23T04:05:13.1852404Z fi_getinfo: -61 2022-11-23T04:05:13.1852539Z fi_getinfo: -61 2022-11-23T04:05:13.1852782Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1853021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1853263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1853656Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1853873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1854263Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1854698Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1855095Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1855194Z ok (13.643s) 2022-11-23T04:05:13.1855545Z test_init_from_local_shards_new_group (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13550 2022-11-23T04:05:13.1855822Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13551 2022-11-23T04:05:13.1856037Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13552 2022-11-23T04:05:13.1856248Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13553 2022-11-23T04:05:13.1856598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1856779Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1857156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1857345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1857703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1857925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1858296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: 
UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1858481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1858836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1858994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1859370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1859560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1859913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1860089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1860454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1860638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1860864Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1861073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1861300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1861519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1861660Z fi_getinfo: -61 2022-11-23T04:05:13.1861794Z fi_getinfo: -61 2022-11-23T04:05:13.1861926Z fi_getinfo: -61 2022-11-23T04:05:13.1862059Z fi_getinfo: -61 2022-11-23T04:05:13.1862283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1862528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 1 2022-11-23T04:05:13.1862764Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1863155Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1863387Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1864142Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1864565Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1864948Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1865186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:05:13.1865424Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:05:13.1865645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:05:13.1865882Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:05:13.1866271Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1866655Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1867033Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T04:05:13.1867480Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:05:13.1867580Z ok (13.846s) 2022-11-23T04:05:13.1867910Z test_local_shards (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14032 2022-11-23T04:05:13.1868127Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14033 2022-11-23T04:05:13.1868325Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14034 2022-11-23T04:05:13.1868541Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14035 2022-11-23T04:05:13.1868912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1869084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1869509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1869704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1870066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1870238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1870588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1870776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1871132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1871303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1871676Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1871861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1872214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1872383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1872751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1872924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1873199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1873453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1873673Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1873907Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1874133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1874369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1874591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1874805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1875206Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1875597Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1875983Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1876418Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1876518Z ok (4.319s) 2022-11-23T04:05:13.1876889Z test_st_base_init_from_local_shards_and_global_metadata (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14317 2022-11-23T04:05:13.1877104Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14318 2022-11-23T04:05:13.1877320Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14319 2022-11-23T04:05:13.1877516Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14320 2022-11-23T04:05:13.1877882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1878054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1878434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1878622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1878980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1879148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1879515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1879709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1880047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:05:13.1880216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1880580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1880768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1881130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1881297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1881661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1881891Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1882107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1882333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1882550Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1882776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1882875Z ok (8.531s) 2022-11-23T04:05:13.1883214Z test_init_from_local_tensor (__main__.TestShardedTensorFromLocalTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14597 2022-11-23T04:05:13.1883428Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14598 2022-11-23T04:05:13.1883639Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14599 2022-11-23T04:05:13.1883853Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14600 2022-11-23T04:05:13.1884210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1884384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1884758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1884996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1885354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1885523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1885889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1886080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1886420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1886592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1886952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1887126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1887496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T04:05:13.1887682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1888049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1888231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1888458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1888667Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1888883Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1889100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1889245Z fi_getinfo: -61 2022-11-23T04:05:13.1889378Z fi_getinfo: -61 2022-11-23T04:05:13.1889513Z fi_getinfo: -61 2022-11-23T04:05:13.1889644Z fi_getinfo: -61 2022-11-23T04:05:13.1889871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1890112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1890548Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1890789Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1891025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1891415Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1891812Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1892201Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1892303Z ok (14.046s) 2022-11-23T04:05:13.1892632Z test_init_from_local_tensor_errors (__main__.TestShardedTensorFromLocalTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15070 2022-11-23T04:05:13.1892851Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15071 2022-11-23T04:05:13.1893064Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15072 2022-11-23T04:05:13.1893275Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15073 2022-11-23T04:05:13.1893640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1893859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1894239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1894426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1894782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1894935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1895306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1895495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1895974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1896152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1896524Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1896709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1897063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:13.1897217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:13.1897585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:13.1897770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:13.1897997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:13.1898225Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:13.1898446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:13.1898668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:13.1898808Z fi_getinfo: -61 2022-11-23T04:05:13.1898925Z fi_getinfo: -61 2022-11-23T04:05:13.1899059Z fi_getinfo: -61 2022-11-23T04:05:13.1899192Z fi_getinfo: -61 2022-11-23T04:05:13.1899435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:13.1899722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:13.1899965Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:13.1900362Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:13.1900598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:13.1900975Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1901367Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1901759Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:13.1901860Z ok (13.843s) 2022-11-23T04:05:13.1902088Z test_serialize_and_deserialize (__main__.TestShardedTensorMetadata) ... ok (0.055s) 2022-11-23T04:05:13.1902113Z 2022-11-23T04:05:13.1902379Z ---------------------------------------------------------------------- 2022-11-23T04:05:13.1902496Z Ran 64 tests in 708.866s 2022-11-23T04:05:13.1902515Z 2022-11-23T04:05:13.1902620Z OK (skipped=1) 2022-11-23T04:05:13.1902639Z 2022-11-23T04:05:13.1902760Z Generating XML reports... 
2022-11-23T04:05:13.1903305Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestCreateTensorFromParams-20221123035323.xml 2022-11-23T04:05:13.1903764Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestLocalTensor-20221123035323.xml 2022-11-23T04:05:13.1904406Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestModuleHookApi-20221123035323.xml 2022-11-23T04:05:13.1904879Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardMetadata-20221123035323.xml 2022-11-23T04:05:13.1905341Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardParameter-20221123035323.xml 2022-11-23T04:05:13.1905828Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardTensor-20221123035323.xml 2022-11-23T04:05:13.1906322Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorChunked-20221123035323.xml 2022-11-23T04:05:13.1906811Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorCustomOps-20221123035323.xml 2022-11-23T04:05:13.1907307Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorEnumerable-20221123035323.xml 2022-11-23T04:05:13.1907820Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalShards-20221123035323.xml 2022-11-23T04:05:13.1908337Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalTensor-20221123035323.xml 
2022-11-23T04:05:13.1908808Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorMetadata-20221123035323.xml 2022-11-23T04:05:13.1908849Z 2022-11-23T04:05:13.1909531Z ##[endgroup] 2022-11-23T04:05:13.1910078Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/test_sharded_tensor (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-test_sharded_tensor_ctwddu86) 2022-11-23T04:05:13.1910098Z 2022-11-23T04:05:13.5057514Z 2022-11-23T04:05:13.5057730Z real 11m56.921s 2022-11-23T04:05:13.5057985Z user 36m52.300s 2022-11-23T04:05:13.5058431Z sys 29m9.903s 2022-11-23T04:05:13.5059050Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 2022-11-23T04:05:15.8450370Z Ignoring disabled issues: [] 2022-11-23T04:05:15.8995878Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:05:15.8996479Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:05:15.8996832Z Selected tests: 2022-11-23T04:05:15.8997137Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 2022-11-23T04:05:15.9022115Z Prioritized test from test file changes. 
2022-11-23T04:05:15.9022667Z reordering tests for PR: 2022-11-23T04:05:15.9022976Z prioritized: [] 2022-11-23T04:05:15.9023496Z the rest: ['distributed/_shard/sharded_tensor/test_sharded_tensor_reshard'] 2022-11-23T04:05:15.9023737Z 2022-11-23T04:05:15.9024631Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:05:15.9025569Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:05:15.9030299Z parallel (file granularity) tests: 2022-11-23T04:05:15.9030838Z 2022-11-23T04:05:15.9031086Z serial (file granularity) tests: 2022-11-23T04:05:15.9031431Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 2022-11-23T04:05:18.2133140Z Ignoring disabled issues: [] 2022-11-23T04:05:18.6242463Z Running distributed/_shard/sharded_tensor/test_sharded_tensor_reshard ... [2022-11-23 04:05:18.623599] 2022-11-23T04:05:18.6245860Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:05:18.624103] 2022-11-23T04:05:31.7100161Z 2022-11-23T04:05:31.7101093Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 2022-11-23T04:05:31.7103052Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/test_sharded_tensor_reshard (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-test_sharded_tensor_reshard_ldts5wzm) 2022-11-23T04:05:31.7103643Z 2022-11-23T04:05:31.7103778Z Running tests... 
2022-11-23T04:05:31.7104590Z ---------------------------------------------------------------------- 2022-11-23T04:05:31.7105210Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard 2022-11-23T04:05:31.7105737Z test_sharded_tensor_reshard (__main__.TestReshard) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:05:31.7106184Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15751 2022-11-23T04:05:31.7106640Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15752 2022-11-23T04:05:31.7107086Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15753 2022-11-23T04:05:31.7107528Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15754 2022-11-23T04:05:31.7108137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:31.7108596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:31.7109174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:31.7109630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:31.7110213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:31.7110659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:31.7111520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:31.7111988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:31.7112568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:31.7113006Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T04:05:31.7113580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:31.7114022Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:31.7114600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:31.7115035Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:31.7115588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:31.7116050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:31.7116486Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:31.7116981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:31.7117554Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:31.7118021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:31.7118480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:31.7118942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:31.7119431Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:31.7119919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:31.7120576Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:31.7121243Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:31.7121925Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:31.7122601Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:31.7122986Z ok (6.357s) 2022-11-23T04:05:31.7123391Z test_sharded_tensor_reshard_errors (__main__.TestReshard) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16048 2022-11-23T04:05:31.7123911Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16049 2022-11-23T04:05:31.7124355Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16050 2022-11-23T04:05:31.7124778Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16051 2022-11-23T04:05:31.7125393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:31.7125838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:31.7126391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:31.7126820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:31.7127385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:31.7127848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:31.7128481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:31.7128940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T04:05:31.7129509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:31.7129953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:31.7130504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:31.7130963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:31.7131529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:31.7131966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:31.7132516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:31.7132973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:31.7133400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:31.7133853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:31.7134373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:31.7134853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:31.7135337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:31.7135797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:31.7136274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:31.7136764Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:31.7137414Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:31.7138077Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:31.7138764Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:31.7139438Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:31.7139810Z ok (4.318s) 2022-11-23T04:05:31.7139959Z 2022-11-23T04:05:31.7140224Z ---------------------------------------------------------------------- 2022-11-23T04:05:31.7140555Z Ran 2 tests in 10.675s 2022-11-23T04:05:31.7140718Z 2022-11-23T04:05:31.7140813Z OK 2022-11-23T04:05:31.7140932Z 2022-11-23T04:05:31.7141057Z Generating XML reports... 2022-11-23T04:05:31.7141673Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard/TEST-TestReshard-20221123040520.xml 2022-11-23T04:05:31.7142035Z 2022-11-23T04:05:31.7142358Z ##[endgroup] 2022-11-23T04:05:31.7143053Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/test_sharded_tensor_reshard (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-test_sharded_tensor_reshard_ldts5wzm) 2022-11-23T04:05:31.7143463Z 2022-11-23T04:05:32.0500748Z 2022-11-23T04:05:32.0500968Z real 0m18.544s 2022-11-23T04:05:32.0501836Z user 0m45.999s 2022-11-23T04:05:32.0502532Z sys 0m34.118s 2022-11-23T04:05:32.0503079Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_chunk 2022-11-23T04:05:34.4142416Z Ignoring disabled issues: [] 2022-11-23T04:05:34.4683256Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T04:05:34.4683796Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:05:34.4684168Z Selected tests: 2022-11-23T04:05:34.4684489Z distributed/_shard/sharded_tensor/ops/test_chunk 2022-11-23T04:05:34.4708761Z Prioritized test from test file changes. 2022-11-23T04:05:34.4709140Z reordering tests for PR: 2022-11-23T04:05:34.4709448Z prioritized: [] 2022-11-23T04:05:34.4709977Z the rest: ['distributed/_shard/sharded_tensor/ops/test_chunk'] 2022-11-23T04:05:34.4710214Z 2022-11-23T04:05:34.4710664Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:05:34.4711608Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:05:34.4720625Z parallel (file granularity) tests: 2022-11-23T04:05:34.4721435Z 2022-11-23T04:05:34.4721663Z serial (file granularity) tests: 2022-11-23T04:05:34.4721935Z distributed/_shard/sharded_tensor/ops/test_chunk 2022-11-23T04:05:36.7702443Z Ignoring disabled issues: [] 2022-11-23T04:05:37.1741358Z Running distributed/_shard/sharded_tensor/ops/test_chunk ... [2022-11-23 04:05:37.173542] 2022-11-23T04:05:37.1742944Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_chunk.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:05:37.174023] 2022-11-23T04:05:50.9731994Z 2022-11-23T04:05:50.9732730Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_chunk 2022-11-23T04:05:50.9733778Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_chunk (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_chunk_xysinub5) 2022-11-23T04:05:50.9734163Z 2022-11-23T04:05:50.9734322Z Running tests... 
2022-11-23T04:05:50.9734844Z ---------------------------------------------------------------------- 2022-11-23T04:05:50.9735436Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_chunk 2022-11-23T04:05:50.9735976Z test_sharded_chunk (__main__.TestShardedTensorChunkOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:05:50.9736511Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16545 2022-11-23T04:05:50.9736967Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16546 2022-11-23T04:05:50.9737423Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16547 2022-11-23T04:05:50.9737850Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16548 2022-11-23T04:05:50.9738454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:50.9738916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:50.9739494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:50.9739967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:50.9740525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:50.9740973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:50.9741550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:50.9742006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:50.9742569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:50.9761521Z warnings.warn(f"loaded {len(slow_tests_dict)} 
slow tests") 2022-11-23T04:05:50.9762466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:50.9762977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:50.9763555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:50.9764021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:50.9764603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:50.9765083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:50.9765508Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:50.9765987Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:50.9766516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:50.9766990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:50.9767435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:50.9767889Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:50.9768463Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:50.9768942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:50.9769581Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:05:50.9770266Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:50.9770949Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:50.9771633Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:50.9772004Z ok (7.062s) 2022-11-23T04:05:50.9772459Z test_sharded_chunk_error (__main__.TestShardedTensorChunkOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16834 2022-11-23T04:05:50.9772998Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16835 2022-11-23T04:05:50.9773422Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16836 2022-11-23T04:05:50.9773873Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16837 2022-11-23T04:05:50.9774456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:50.9774888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:50.9775431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:50.9775879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:50.9776442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:50.9776860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:50.9777427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:50.9777870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T04:05:50.9778421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:50.9778888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:50.9779443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:50.9779875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:50.9780432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:05:50.9780845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:05:50.9781399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:05:50.9781858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:05:50.9782266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:05:50.9782738Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:05:50.9783207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:05:50.9783676Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:05:50.9784426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:05:50.9784925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:05:50.9785511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:05:50.9785984Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:05:50.9786654Z 
INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:50.9787340Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:50.9788024Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:50.9788683Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:05:50.9789073Z ok (4.317s) 2022-11-23T04:05:50.9789229Z 2022-11-23T04:05:50.9789511Z ---------------------------------------------------------------------- 2022-11-23T04:05:50.9789843Z Ran 2 tests in 11.379s 2022-11-23T04:05:50.9789990Z 2022-11-23T04:05:50.9790085Z OK 2022-11-23T04:05:50.9790220Z 2022-11-23T04:05:50.9790349Z Generating XML reports... 2022-11-23T04:05:50.9791012Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_chunk/TEST-TestShardedTensorChunkOps-20221123040539.xml 2022-11-23T04:05:50.9791398Z 2022-11-23T04:05:50.9791724Z ##[endgroup] 2022-11-23T04:05:50.9792379Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_chunk (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_chunk_xysinub5) 2022-11-23T04:05:50.9792770Z 2022-11-23T04:05:51.3278001Z 2022-11-23T04:05:51.3278584Z real 0m19.277s 2022-11-23T04:05:51.3278954Z user 0m47.955s 2022-11-23T04:05:51.3279207Z sys 0m34.159s 2022-11-23T04:05:51.3279857Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-11-23T04:05:53.7302107Z Ignoring disabled issues: [] 2022-11-23T04:05:53.7848078Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T04:05:53.7848650Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:05:53.7849010Z Selected tests: 2022-11-23T04:05:53.7849326Z distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-11-23T04:05:53.7881686Z Prioritized test from test file changes. 2022-11-23T04:05:53.7882570Z reordering tests for PR: 2022-11-23T04:05:53.7882875Z prioritized: [] 2022-11-23T04:05:53.7883534Z the rest: ['distributed/_shard/sharded_tensor/ops/test_elementwise_ops'] 2022-11-23T04:05:53.7883798Z 2022-11-23T04:05:53.7884245Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:05:53.7885219Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:05:53.7888588Z parallel (file granularity) tests: 2022-11-23T04:05:53.7888928Z 2022-11-23T04:05:53.7889636Z serial (file granularity) tests: 2022-11-23T04:05:53.7890047Z distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-11-23T04:05:56.0862908Z Ignoring disabled issues: [] 2022-11-23T04:05:56.5517189Z Running distributed/_shard/sharded_tensor/ops/test_elementwise_ops ... [2022-11-23 04:05:56.551109] 2022-11-23T04:05:56.5519298Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_elementwise_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 04:05:56.551566] 2022-11-23T04:06:18.2765292Z 2022-11-23T04:06:18.2765753Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-11-23T04:06:18.2767082Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_elementwise_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_elementwise_ops_xbd_5snm) 2022-11-23T04:06:18.2773148Z 2022-11-23T04:06:18.2774275Z Running tests... 2022-11-23T04:06:18.2774932Z ---------------------------------------------------------------------- 2022-11-23T04:06:18.2775602Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_elementwise_ops 2022-11-23T04:06:18.2776236Z test_sharded_dropout (__main__.TestShardedTensorElementWiseOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:06:18.2776760Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17331 2022-11-23T04:06:18.2777275Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17332 2022-11-23T04:06:18.2777918Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17333 2022-11-23T04:06:18.2778402Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17334 2022-11-23T04:06:18.2779021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2779486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2779962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2780426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2780987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2781470Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2782134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2782559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2783200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2783562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2784621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2785064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2785909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2786394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2787001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2787455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2787903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:06:18.2788385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:06:18.2788904Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:06:18.2789403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:06:18.2789806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:06:18.2790287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T04:06:18.2790871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:06:18.2791329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:06:18.2792245Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2792943Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2793529Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2794290Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2794598Z ok (6.067s) 2022-11-23T04:06:18.2795072Z test_sharded_gelu (__main__.TestShardedTensorElementWiseOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17616 2022-11-23T04:06:18.2795629Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17617 2022-11-23T04:06:18.2796060Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17618 2022-11-23T04:06:18.2796590Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17619 2022-11-23T04:06:18.2797203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2797644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2798151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2798626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2799208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2799639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2800216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2800702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2801357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2801711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2802373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2802763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2803384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:06:18.2803852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2804427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2804912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2805320Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:06:18.2805911Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:06:18.2806283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:06:18.2806737Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:06:18.2807277Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:06:18.2807728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:06:18.2808277Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:06:18.2808693Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:06:18.2809513Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2810117Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2810908Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2811551Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:06:18.2811964Z ok (4.418s) 2022-11-23T04:06:18.2812343Z test_sharded_relu (__main__.TestShardedTensorElementWiseOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17901 2022-11-23T04:06:18.2812872Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17902 2022-11-23T04:06:18.2813330Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17903 2022-11-23T04:06:18.2813760Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17904 2022-11-23T04:06:18.2814453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2814808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2815482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2815869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2816453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2816878Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2817455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2817992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2818487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2819021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2819598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2819972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled 
tests") 2022-11-23T04:06:18.2820668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2821046Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2821623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2822070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2822515Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:06:18.2823072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:06:18.2823472Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:06:18.2824304Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:06:18.2824710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:06:18.2825304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:06:18.2825702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:06:18.2826172Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:06:18.2826936Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2827697Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2828292Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:06:18.2829061Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2829383Z ok (4.418s) 2022-11-23T04:06:18.2829864Z test_sharded_tensor_nan_to_num (__main__.TestShardedTensorElementWiseOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18186 2022-11-23T04:06:18.2830409Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18187 2022-11-23T04:06:18.2830863Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18188 2022-11-23T04:06:18.2831319Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18189 2022-11-23T04:06:18.2831927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2832363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2832943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2833419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2834016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2834739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2835330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2835802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2836356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2836798Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2837363Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2837809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2838484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:18.2838944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:18.2839520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:18.2839969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:18.2840463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:06:18.2840905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:06:18.2841385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:06:18.2841856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:06:18.2842361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:06:18.2842852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:06:18.2843312Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:06:18.2843821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:06:18.2844552Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2845245Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:06:18.2845911Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2846597Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:18.2846990Z ok (4.417s) 2022-11-23T04:06:18.2847136Z 2022-11-23T04:06:18.2847412Z ---------------------------------------------------------------------- 2022-11-23T04:06:18.2847726Z Ran 4 tests in 19.321s 2022-11-23T04:06:18.2847893Z 2022-11-23T04:06:18.2847992Z OK 2022-11-23T04:06:18.2848128Z 2022-11-23T04:06:18.2848256Z Generating XML reports... 2022-11-23T04:06:18.2848943Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_elementwise_ops/TEST-TestShardedTensorElementWiseOps-20221123040558.xml 2022-11-23T04:06:18.2849379Z 2022-11-23T04:06:18.2850105Z ##[endgroup] 2022-11-23T04:06:18.2850719Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_elementwise_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_elementwise_ops_xbd_5snm) 2022-11-23T04:06:18.2851138Z 2022-11-23T04:06:18.6788944Z 2022-11-23T04:06:18.6789350Z real 0m27.351s 2022-11-23T04:06:18.6789712Z user 1m14.671s 2022-11-23T04:06:18.6789968Z sys 0m51.244s 2022-11-23T04:06:18.6790554Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_embedding 2022-11-23T04:06:21.0184349Z Ignoring disabled issues: [] 2022-11-23T04:06:21.0721224Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T04:06:21.0721835Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:06:21.0722168Z Selected tests: 2022-11-23T04:06:21.0722484Z distributed/_shard/sharded_tensor/ops/test_embedding 2022-11-23T04:06:21.0748908Z Prioritized test from test file changes. 2022-11-23T04:06:21.0749765Z reordering tests for PR: 2022-11-23T04:06:21.0750078Z prioritized: [] 2022-11-23T04:06:21.0750623Z the rest: ['distributed/_shard/sharded_tensor/ops/test_embedding'] 2022-11-23T04:06:21.0750864Z 2022-11-23T04:06:21.0751698Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:06:21.0752654Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:06:21.0759552Z parallel (file granularity) tests: 2022-11-23T04:06:21.0759862Z 2022-11-23T04:06:21.0760995Z serial (file granularity) tests: 2022-11-23T04:06:21.0761328Z distributed/_shard/sharded_tensor/ops/test_embedding 2022-11-23T04:06:23.3876535Z Ignoring disabled issues: [] 2022-11-23T04:06:23.7897288Z Running distributed/_shard/sharded_tensor/ops/test_embedding ... [2022-11-23 04:06:23.789121] 2022-11-23T04:06:23.7898499Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 04:06:23.789588] 2022-11-23T04:06:37.7100178Z 2022-11-23T04:06:37.7100820Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_embedding 2022-11-23T04:06:37.7102104Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_embedding (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_embedding_kd6kon3q) 2022-11-23T04:06:37.7102521Z 2022-11-23T04:06:37.7102618Z Running tests... 2022-11-23T04:06:37.7103497Z ---------------------------------------------------------------------- 2022-11-23T04:06:37.7104511Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding 2022-11-23T04:06:37.7105080Z test_sharded_embedding_colwise (__main__.TestShardedEmbedding) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:06:37.7105557Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18687 2022-11-23T04:06:37.7106007Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18688 2022-11-23T04:06:37.7106453Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18689 2022-11-23T04:06:37.7106874Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18690 2022-11-23T04:06:37.7107503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:37.7107961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:37.7108548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:37.7109002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:37.7109620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:37.7110073Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T04:06:37.7110650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:37.7111099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:37.7111674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:37.7112120Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:37.7112690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:37.7113135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:37.7113709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:37.7114149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:37.7114711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:37.7115274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:37.7115725Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:06:37.7116222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:06:37.7116693Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:06:37.7117172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:06:37.7117655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:06:37.7118141Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T04:06:37.7118598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:06:37.7119080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:06:37.7119743Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:37.7120434Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:37.7121102Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:37.7121870Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:37.7122265Z ok (6.703s) 2022-11-23T04:06:37.7122697Z test_sharded_embedding_rowwise (__main__.TestShardedEmbedding) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18980 2022-11-23T04:06:37.7123238Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18981 2022-11-23T04:06:37.7123693Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18982 2022-11-23T04:06:37.7124138Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18983 2022-11-23T04:06:37.7124722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:37.7125175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:37.7125755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:37.7126225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:37.7126783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:37.7127222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:37.7127791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:37.7128237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:37.7128811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:37.7129250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:37.7129820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:37.7130260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:37.7130831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:06:37.7131276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:37.7131885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:37.7132356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:37.7132796Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:06:37.7133293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:06:37.7133742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:06:37.7134222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:06:37.7134714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:06:37.7135200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:06:37.7135665Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:06:37.7136150Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:06:37.7136807Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:37.7137474Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:37.7138222Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:37.7138902Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:06:37.7139812Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:06:37.7140360Z warnings.warn( 2022-11-23T04:06:37.7141095Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:06:37.7141633Z warnings.warn( 2022-11-23T04:06:37.7142368Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:06:37.7142910Z warnings.warn( 2022-11-23T04:06:37.7143625Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:06:37.7144351Z warnings.warn( 2022-11-23T04:06:37.7144591Z ok (4.819s) 2022-11-23T04:06:37.7144741Z 2022-11-23T04:06:37.7145005Z ---------------------------------------------------------------------- 2022-11-23T04:06:37.7145340Z Ran 2 tests in 11.523s 2022-11-23T04:06:37.7145507Z 2022-11-23T04:06:37.7145602Z OK 2022-11-23T04:06:37.7145737Z 2022-11-23T04:06:37.7145863Z Generating XML reports... 
2022-11-23T04:06:37.7146491Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding/TEST-TestShardedEmbedding-20221123040625.xml 2022-11-23T04:06:37.7146878Z 2022-11-23T04:06:37.7147192Z ##[endgroup] 2022-11-23T04:06:37.7147867Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_embedding (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_embedding_kd6kon3q) 2022-11-23T04:06:37.7148274Z 2022-11-23T04:06:38.0641131Z 2022-11-23T04:06:38.0641690Z real 0m19.385s 2022-11-23T04:06:38.0642067Z user 0m48.484s 2022-11-23T04:06:38.0642311Z sys 0m34.875s 2022-11-23T04:06:38.0643254Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-11-23T04:06:40.4107878Z Ignoring disabled issues: [] 2022-11-23T04:06:40.4654905Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:06:40.4655474Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:06:40.4655853Z Selected tests: 2022-11-23T04:06:40.4656171Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-11-23T04:06:40.4686374Z Prioritized test from test file changes. 
2022-11-23T04:06:40.4686755Z reordering tests for PR: 2022-11-23T04:06:40.4687052Z prioritized: [] 2022-11-23T04:06:40.4687552Z the rest: ['distributed/_shard/sharded_tensor/ops/test_embedding_bag'] 2022-11-23T04:06:40.4687810Z 2022-11-23T04:06:40.4688410Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:06:40.4689326Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:06:40.4692807Z parallel (file granularity) tests: 2022-11-23T04:06:40.4693160Z 2022-11-23T04:06:40.4693434Z serial (file granularity) tests: 2022-11-23T04:06:40.4693803Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-11-23T04:06:42.7621234Z Ignoring disabled issues: [] 2022-11-23T04:06:43.1951799Z Running distributed/_shard/sharded_tensor/ops/test_embedding_bag ... [2022-11-23 04:06:43.194447] 2022-11-23T04:06:43.1952632Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding_bag.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:06:43.194929] 2022-11-23T04:06:57.7937350Z 2022-11-23T04:06:57.7938224Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-11-23T04:06:57.7939283Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_embedding_bag (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_embedding_bag_67vwqzza) 2022-11-23T04:06:57.7939703Z 2022-11-23T04:06:57.7939818Z Running tests... 
2022-11-23T04:06:57.7940490Z ---------------------------------------------------------------------- 2022-11-23T04:06:57.7941181Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag 2022-11-23T04:06:57.7941730Z test_sharded_embedding_bag_colwise (__main__.TestShardedEmbeddingBag) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:06:57.7942236Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19481 2022-11-23T04:06:57.7942708Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19482 2022-11-23T04:06:57.7943127Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19483 2022-11-23T04:06:57.7943597Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19484 2022-11-23T04:06:57.7944696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:57.7945164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:57.7945730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:57.7946212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:57.7946804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:57.7947255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:57.7947837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:57.7948649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:57.7949276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:57.7949776Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T04:06:57.7950336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:57.7951018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:57.7952154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:57.7953038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:57.7954254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:57.7954750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:57.7955208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:06:57.7955653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:06:57.7956139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:06:57.7956625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:06:57.7957262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:06:57.7957728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:06:57.7958217Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:06:57.7958703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:06:57.7959457Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:06:57.7960149Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:57.7960834Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:57.7961524Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:57.7961896Z ok (7.039s) 2022-11-23T04:06:57.7962358Z test_sharded_embedding_bag_rowwise (__main__.TestShardedEmbeddingBag) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19774 2022-11-23T04:06:57.7962909Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19775 2022-11-23T04:06:57.7963363Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19776 2022-11-23T04:06:57.7963790Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19777 2022-11-23T04:06:57.7964407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:57.7964858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:57.7965406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:57.7965861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:57.7966433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:57.7966905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:57.7967472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:57.7968012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T04:06:57.7968639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:57.7969086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:57.7969642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:57.7970115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:57.7970696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:06:57.7971138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:06:57.7971695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:06:57.7972157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:06:57.7972599Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:06:57.7973057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:06:57.7973545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:06:57.7974136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:06:57.7974651Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:06:57.7975143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:06:57.7975655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:06:57.7976183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:06:57.7976886Z 
INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:57.7977600Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:57.7978329Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:57.7979057Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:06:57.7980019Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:06:57.7980581Z warnings.warn( 2022-11-23T04:06:57.7981371Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:06:57.7981949Z warnings.warn( 2022-11-23T04:06:57.7982744Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:06:57.7983306Z warnings.warn( 2022-11-23T04:06:57.7984463Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 
2022-11-23T04:06:57.7985037Z warnings.warn( 2022-11-23T04:06:57.7985279Z ok (5.119s) 2022-11-23T04:06:57.7985428Z 2022-11-23T04:06:57.7985694Z ---------------------------------------------------------------------- 2022-11-23T04:06:57.7986139Z Ran 2 tests in 12.159s 2022-11-23T04:06:57.7986317Z 2022-11-23T04:06:57.7986412Z OK 2022-11-23T04:06:57.7986545Z 2022-11-23T04:06:57.7986651Z Generating XML reports... 2022-11-23T04:06:57.7987324Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag/TEST-TestShardedEmbeddingBag-20221123040645.xml 2022-11-23T04:06:57.7987723Z 2022-11-23T04:06:57.7988081Z ##[endgroup] 2022-11-23T04:06:57.7988756Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_embedding_bag (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_embedding_bag_67vwqzza) 2022-11-23T04:06:57.7989164Z 2022-11-23T04:06:58.1824544Z 2022-11-23T04:06:58.1825508Z real 0m20.118s 2022-11-23T04:06:58.1825836Z user 0m51.125s 2022-11-23T04:06:58.1826060Z sys 0m31.613s 2022-11-23T04:06:58.1826667Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-11-23T04:07:00.5202872Z Ignoring disabled issues: [] 2022-11-23T04:07:00.5745598Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:07:00.5746163Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:07:00.5746548Z Selected tests: 2022-11-23T04:07:00.5746863Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-11-23T04:07:00.5772024Z Prioritized test from test file changes. 
2022-11-23T04:07:00.5772397Z reordering tests for PR: 2022-11-23T04:07:00.5772666Z prioritized: [] 2022-11-23T04:07:00.5773188Z the rest: ['distributed/_shard/sharded_tensor/ops/test_binary_cmp'] 2022-11-23T04:07:00.5773406Z 2022-11-23T04:07:00.5773935Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:07:00.5774886Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:07:00.5779676Z parallel (file granularity) tests: 2022-11-23T04:07:00.5779999Z 2022-11-23T04:07:00.5780227Z serial (file granularity) tests: 2022-11-23T04:07:00.5780560Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-11-23T04:07:02.9238518Z Ignoring disabled issues: [] 2022-11-23T04:07:03.3295712Z Running distributed/_shard/sharded_tensor/ops/test_binary_cmp ... [2022-11-23 04:07:03.328973] 2022-11-23T04:07:03.3299777Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_binary_cmp.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:07:03.329440] 2022-11-23T04:08:02.8334934Z 2022-11-23T04:08:02.8335745Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-11-23T04:08:02.8336807Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_binary_cmp (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_binary_cmp_rznn830u) 2022-11-23T04:08:02.8337215Z 2022-11-23T04:08:02.8337310Z Running tests... 
2022-11-23T04:08:02.8338391Z ---------------------------------------------------------------------- 2022-11-23T04:08:02.8338996Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp 2022-11-23T04:08:02.8339488Z test_torch_allclose (__main__.TestShardedTensorBinaryOps) 2022-11-23T04:08:02.8347315Z Test torch.allclose(ShardedTensor, ShardedTensor) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:08:02.8347860Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20275 2022-11-23T04:08:02.8348321Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20276 2022-11-23T04:08:02.8348783Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20277 2022-11-23T04:08:02.8349467Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20278 2022-11-23T04:08:02.8350247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8350792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8351379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8351914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8352501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8352951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8353508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8353972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8354548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 
slow tests 2022-11-23T04:08:02.8355005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8355562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8356161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8356741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8357190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8357741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8358199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8358643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:08:02.8359181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:08:02.8359632Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:08:02.8360636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:08:02.8361035Z fi_getinfo: -61 2022-11-23T04:08:02.8361291Z fi_getinfo: -61 2022-11-23T04:08:02.8361580Z fi_getinfo: -61 2022-11-23T04:08:02.8361847Z fi_getinfo: -61 2022-11-23T04:08:02.8362206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:08:02.8362694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:08:02.8363180Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:08:02.8363836Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for 
key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8364352Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:08:02.8365005Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8365693Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8366382Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8366752Z ok (15.846s) 2022-11-23T04:08:02.8367217Z test_torch_allclose_tensor_specs (__main__.TestShardedTensorBinaryOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20744 2022-11-23T04:08:02.8367850Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20745 2022-11-23T04:08:02.8368305Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20746 2022-11-23T04:08:02.8368726Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20747 2022-11-23T04:08:02.8369327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8369805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8370355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8370817Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8371437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8371876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8372428Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8372928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8373497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8373916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8374662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8375107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8375775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8376215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8376780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8377221Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8377653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:08:02.8378193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:08:02.8378689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:08:02.8379137Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:08:02.8379519Z fi_getinfo: -61 2022-11-23T04:08:02.8379792Z fi_getinfo: -61 2022-11-23T04:08:02.8380044Z fi_getinfo: -61 2022-11-23T04:08:02.8380309Z fi_getinfo: -61 2022-11-23T04:08:02.8380686Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T04:08:02.8381164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:08:02.8381817Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8382349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:08:02.8382835Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:08:02.8383467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8384548Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8385233Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8385857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:08:02.8386340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:08:02.8386819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:08:02.8387464Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:08:02.8387972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:08:02.8388616Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:08:02.8389287Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T04:08:02.8389808Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T04:08:02.8390277Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T04:08:02.8390754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T04:08:02.8391393Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:08:02.8391915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T04:08:02.8392628Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:08:02.8393294Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:08:02.8393964Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:08:02.8394634Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:08:02.8395002Z ok (13.743s) 2022-11-23T04:08:02.8395316Z test_torch_equal (__main__.TestShardedTensorBinaryOps) 2022-11-23T04:08:02.8395822Z Test torch.equal(ShardedTensor, ShardedTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21221 2022-11-23T04:08:02.8396328Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21222 2022-11-23T04:08:02.8396773Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21223 2022-11-23T04:08:02.8397209Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21224 2022-11-23T04:08:02.8397819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8398308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8398898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8399366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8399922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8400364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8400933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8401395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8401947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8402388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8403078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8403549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8404112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:08:02.8404553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8405117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8405564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8405999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:08:02.8406468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:08:02.8406932Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:08:02.8407379Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:08:02.8407762Z fi_getinfo: -61 2022-11-23T04:08:02.8408034Z fi_getinfo: -61 2022-11-23T04:08:02.8408287Z fi_getinfo: -61 2022-11-23T04:08:02.8408551Z fi_getinfo: -61 2022-11-23T04:08:02.8408928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:08:02.8409468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:08:02.8409952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:08:02.8410608Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8411135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:08:02.8411767Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8412438Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:08:02.8413137Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8413516Z ok (13.840s) 2022-11-23T04:08:02.8413959Z test_torch_equal_tensor_specs (__main__.TestShardedTensorBinaryOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21690 2022-11-23T04:08:02.8414504Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21691 2022-11-23T04:08:02.8414950Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21692 2022-11-23T04:08:02.8415374Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21693 2022-11-23T04:08:02.8415979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8416423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8416991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8417442Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8418025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8418466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8419012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8419474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8420049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8420544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8421149Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8421641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8422297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:02.8422770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:02.8423357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:02.8424045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:02.8424506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:08:02.8424967Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:08:02.8425429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:08:02.8425884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:08:02.8426280Z fi_getinfo: -61 2022-11-23T04:08:02.8426534Z fi_getinfo: -61 2022-11-23T04:08:02.8426799Z fi_getinfo: -61 2022-11-23T04:08:02.8427200Z fi_getinfo: -61 2022-11-23T04:08:02.8427562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:08:02.8428051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:08:02.8428535Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:08:02.8429171Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:08:02.8429707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:08:02.8430340Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8431017Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8431684Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:02.8432202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:08:02.8432684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:08:02.8433166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:08:02.8433803Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:08:02.8434327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:08:02.8434964Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:08:02.8435639Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:08:02.8436299Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T04:08:02.8436819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T04:08:02.8437301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T04:08:02.8437837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T04:08:02.8438327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T04:08:02.8438968Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:08:02.8439645Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:08:02.8440303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:08:02.8440978Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:08:02.8441366Z ok (13.639s) 2022-11-23T04:08:02.8441518Z 2022-11-23T04:08:02.8441787Z ---------------------------------------------------------------------- 2022-11-23T04:08:02.8442102Z Ran 4 tests in 57.069s 2022-11-23T04:08:02.8442280Z 2022-11-23T04:08:02.8442375Z OK 2022-11-23T04:08:02.8442508Z 2022-11-23T04:08:02.8442632Z Generating XML reports... 
2022-11-23T04:08:02.8443282Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp/TEST-TestShardedTensorBinaryOps-20221123040705.xml 2022-11-23T04:08:02.8443687Z 2022-11-23T04:08:02.8444185Z ##[endgroup] 2022-11-23T04:08:02.8444863Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_binary_cmp (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_binary_cmp_rznn830u) 2022-11-23T04:08:02.8445374Z 2022-11-23T04:08:03.2364906Z 2022-11-23T04:08:03.2365630Z real 1m5.054s 2022-11-23T04:08:03.2366213Z user 3m3.257s 2022-11-23T04:08:03.2366590Z sys 2m32.721s 2022-11-23T04:08:03.2367202Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_init 2022-11-23T04:08:05.5979797Z Ignoring disabled issues: [] 2022-11-23T04:08:05.6523160Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:08:05.6523735Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:08:05.6524097Z Selected tests: 2022-11-23T04:08:05.6524404Z distributed/_shard/sharded_tensor/ops/test_init 2022-11-23T04:08:05.6552170Z Prioritized test from test file changes. 
2022-11-23T04:08:05.6552511Z reordering tests for PR: 2022-11-23T04:08:05.6552782Z prioritized: [] 2022-11-23T04:08:05.6553304Z the rest: ['distributed/_shard/sharded_tensor/ops/test_init'] 2022-11-23T04:08:05.6553511Z 2022-11-23T04:08:05.6554042Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:08:05.6554976Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:08:05.6561821Z parallel (file granularity) tests: 2022-11-23T04:08:05.6562718Z 2022-11-23T04:08:05.6563288Z serial (file granularity) tests: 2022-11-23T04:08:05.6563778Z distributed/_shard/sharded_tensor/ops/test_init 2022-11-23T04:08:07.9704506Z Ignoring disabled issues: [] 2022-11-23T04:08:08.3958103Z Running distributed/_shard/sharded_tensor/ops/test_init ... [2022-11-23 04:08:08.395167] 2022-11-23T04:08:08.3959307Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_init.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:08:08.395651] 2022-11-23T04:08:53.6381879Z 2022-11-23T04:08:53.6382529Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_init 2022-11-23T04:08:53.6386458Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_init (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_init_qlchnsb1) 2022-11-23T04:08:53.6387152Z 2022-11-23T04:08:53.6387289Z Running tests... 
2022-11-23T04:08:53.6387807Z ---------------------------------------------------------------------- 2022-11-23T04:08:53.6388400Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init 2022-11-23T04:08:53.6388915Z test_init_sharded_tensor_with_kaiming_uniform (__main__.TestShardedTensorNNInit) 2022-11-23T04:08:53.6389441Z Test torch.nn.init.kaiming_uniform_(ShardedTensor, a, mode, nonlinearit) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:08:53.6390292Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22379 2022-11-23T04:08:53.6390759Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22380 2022-11-23T04:08:53.6391214Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22381 2022-11-23T04:08:53.6391638Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22382 2022-11-23T04:08:53.6392277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6394977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6395646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6396153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6396986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6397423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6398016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6398507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6399102Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6399537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6400117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6400568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6401156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6401611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6402201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6402680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6403130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:08:53.6403593Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:08:53.6404096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:08:53.6404565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:08:53.6411435Z fi_getinfo: -61 2022-11-23T04:08:53.6412020Z fi_getinfo: -61 2022-11-23T04:08:53.6412653Z fi_getinfo: -61 2022-11-23T04:08:53.6413215Z fi_getinfo: -61 2022-11-23T04:08:53.6413910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:08:53.6414926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:08:53.6415426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 
2022-11-23T04:08:53.6416217Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6416753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:08:53.6417409Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6418091Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6418784Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6419153Z ok (15.478s) 2022-11-23T04:08:53.6419497Z test_init_sharded_tensor_with_normal (__main__.TestShardedTensorNNInit) 2022-11-23T04:08:53.6420039Z Test torch.nn.init.normal_(ShardedTensor, mean, std) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22848 2022-11-23T04:08:53.6420553Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22849 2022-11-23T04:08:53.6421010Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22850 2022-11-23T04:08:53.6421461Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22851 2022-11-23T04:08:53.6422158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6422672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6423253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6423728Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6424733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T04:08:53.6425188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6425758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6426213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6426770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6427246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6427839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6428312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6428984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6429440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6430012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6430457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6430892Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:08:53.6431367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:08:53.6431839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:08:53.6432288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:08:53.6432669Z fi_getinfo: -61 2022-11-23T04:08:53.6432941Z fi_getinfo: -61 2022-11-23T04:08:53.6433193Z fi_getinfo: -61 2022-11-23T04:08:53.6433455Z fi_getinfo: -61 
2022-11-23T04:08:53.6433830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:08:53.6434430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:08:53.6434965Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:08:53.6435617Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6436143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:08:53.6436770Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6437447Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6438120Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6438506Z ok (13.742s) 2022-11-23T04:08:53.6438828Z test_init_sharded_tensor_with_uniform (__main__.TestShardedTensorNNInit) 2022-11-23T04:08:53.6439348Z Test torch.nn.init.uniform_(ShardedTensor, a, b) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23317 2022-11-23T04:08:53.6439863Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23318 2022-11-23T04:08:53.6440305Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23319 2022-11-23T04:08:53.6440812Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23320 2022-11-23T04:08:53.6441413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6441859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6442411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6443007Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6443588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6444028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6444578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6445045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6445611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:08:53.6446051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6446593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6447053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6447622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:08:53.6448101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:08:53.6448665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:08:53.6449118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:08:53.6449555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:08:53.6450005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:08:53.6450459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:08:53.6450921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:08:53.6451288Z fi_getinfo: -61 2022-11-23T04:08:53.6451557Z fi_getinfo: -61 2022-11-23T04:08:53.6451879Z fi_getinfo: -61 2022-11-23T04:08:53.6452136Z fi_getinfo: -61 2022-11-23T04:08:53.6452507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:08:53.6452995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:08:53.6453476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:08:53.6453945Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:08:53.6454586Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6455260Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6455940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:08:53.6456593Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:08:53.6456979Z ok (13.641s) 2022-11-23T04:08:53.6457127Z 2022-11-23T04:08:53.6457393Z ---------------------------------------------------------------------- 2022-11-23T04:08:53.6457705Z Ran 3 tests in 42.861s 2022-11-23T04:08:53.6457935Z 2022-11-23T04:08:53.6458027Z OK 2022-11-23T04:08:53.6458159Z 2022-11-23T04:08:53.6458286Z Generating XML reports... 2022-11-23T04:08:53.6458984Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init/TEST-TestShardedTensorNNInit-20221123040810.xml 2022-11-23T04:08:53.6459355Z 2022-11-23T04:08:53.6459912Z ##[endgroup] 2022-11-23T04:08:53.6460564Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_init (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_init_qlchnsb1) 2022-11-23T04:08:53.6460948Z 2022-11-23T04:08:53.9952579Z 2022-11-23T04:08:53.9953135Z real 0m50.759s 2022-11-23T04:08:53.9953481Z user 2m22.105s 2022-11-23T04:08:53.9953704Z sys 1m57.628s 2022-11-23T04:08:53.9954280Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_linear 2022-11-23T04:08:56.3938194Z Ignoring disabled issues: [] 2022-11-23T04:08:56.4481313Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:08:56.4481916Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:08:56.4482273Z Selected tests: 2022-11-23T04:08:56.4482570Z distributed/_shard/sharded_tensor/ops/test_linear 2022-11-23T04:08:56.4506067Z Prioritized test from test file changes. 
2022-11-23T04:08:56.4506406Z reordering tests for PR: 2022-11-23T04:08:56.4506671Z prioritized: [] 2022-11-23T04:08:56.4507160Z the rest: ['distributed/_shard/sharded_tensor/ops/test_linear'] 2022-11-23T04:08:56.4507398Z 2022-11-23T04:08:56.4507935Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:08:56.4508868Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:08:56.4513834Z parallel (file granularity) tests: 2022-11-23T04:08:56.4514464Z 2022-11-23T04:08:56.4514722Z serial (file granularity) tests: 2022-11-23T04:08:56.4515039Z distributed/_shard/sharded_tensor/ops/test_linear 2022-11-23T04:08:58.6981303Z Ignoring disabled issues: [] 2022-11-23T04:08:59.1228152Z Running distributed/_shard/sharded_tensor/ops/test_linear ... [2022-11-23 04:08:59.122278] 2022-11-23T04:08:59.1233110Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_linear.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:08:59.122783] 2022-11-23T04:09:18.8721829Z 2022-11-23T04:09:18.8722611Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_linear 2022-11-23T04:09:18.8726516Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_linear (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_linear_1te_fl6z) 2022-11-23T04:09:18.8726969Z 2022-11-23T04:09:18.8727087Z Running tests... 
2022-11-23T04:09:18.8727873Z ---------------------------------------------------------------------- 2022-11-23T04:09:18.8728606Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_linear 2022-11-23T04:09:18.8729630Z test_sharded_linear_colwise (__main__.TestShardedTensorOpsLinear) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:09:18.8730538Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23998 2022-11-23T04:09:18.8731306Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23999 2022-11-23T04:09:18.8731763Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24000 2022-11-23T04:09:18.8732641Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24001 2022-11-23T04:09:18.8733497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8734249Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8734834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8735689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8736319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8737200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8738243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8739070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8739673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8740356Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8741553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8742234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8742818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8743265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8744263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8744749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8745172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:09:18.8745646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:09:18.8746119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:09:18.8746568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:09:18.8747048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:09:18.8747542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:09:18.8748172Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:09:18.8748653Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:09:18.8749322Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:09:18.8750009Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8750698Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8751356Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8752267Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:09:18.8752814Z warnings.warn( 2022-11-23T04:09:18.8753569Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:09:18.8754088Z warnings.warn( 2022-11-23T04:09:18.8754830Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:09:18.8755457Z warnings.warn( 2022-11-23T04:09:18.8756196Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T04:09:18.8756710Z warnings.warn( 2022-11-23T04:09:18.8756947Z ok (7.422s) 2022-11-23T04:09:18.8757401Z test_sharded_linear_errors (__main__.TestShardedTensorOpsLinear) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24307 2022-11-23T04:09:18.8757950Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24308 2022-11-23T04:09:18.8758377Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24309 2022-11-23T04:09:18.8758820Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24310 2022-11-23T04:09:18.8759425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8759859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8760431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8760899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8761479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8761903Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8762473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8762941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8763656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8764094Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8764662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8765121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8765747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:09:18.8766202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8766767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8767227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8767648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:09:18.8768134Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:09:18.8768617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:09:18.8769055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:09:18.8769521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:09:18.8769997Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:09:18.8770486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:09:18.8770948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:09:18.8771676Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8772358Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8773039Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8773687Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:09:18.8774080Z ok (4.520s) 2022-11-23T04:09:18.8774539Z test_sharded_linear_rowwise (__main__.TestShardedTensorOpsLinear) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24596 2022-11-23T04:09:18.8775070Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24597 2022-11-23T04:09:18.8775521Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24598 2022-11-23T04:09:18.8775963Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24599 2022-11-23T04:09:18.8776571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8777002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8777571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8778038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8778610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8779034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8779599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8780063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8780613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8781063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8781628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8782086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled 
tests") 2022-11-23T04:09:18.8782688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:09:18.8783142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:09:18.8783711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:09:18.8784367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:09:18.8784809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:09:18.8785283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:09:18.8785749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:09:18.8786197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:09:18.8786686Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:09:18.8787176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:09:18.8787663Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:09:18.8788128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:09:18.8788883Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8789565Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8790230Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:09:18.8790901Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:09:18.8791299Z ok (5.321s) 2022-11-23T04:09:18.8791451Z 2022-11-23T04:09:18.8791725Z ---------------------------------------------------------------------- 2022-11-23T04:09:18.8792038Z Ran 3 tests in 17.264s 2022-11-23T04:09:18.8792204Z 2022-11-23T04:09:18.8792298Z OK 2022-11-23T04:09:18.8792430Z 2022-11-23T04:09:18.8792555Z Generating XML reports... 2022-11-23T04:09:18.8793199Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_linear/TEST-TestShardedTensorOpsLinear-20221123040901.xml 2022-11-23T04:09:18.8793599Z 2022-11-23T04:09:18.8793937Z ##[endgroup] 2022-11-23T04:09:18.8794593Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_linear (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_linear_1te_fl6z) 2022-11-23T04:09:18.8794979Z 2022-11-23T04:09:19.2658766Z 2022-11-23T04:09:19.2659424Z real 0m25.271s 2022-11-23T04:09:19.2659711Z user 1m7.609s 2022-11-23T04:09:19.2659976Z sys 0m41.009s 2022-11-23T04:09:19.2660572Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T04:09:21.6638012Z Ignoring disabled issues: [] 2022-11-23T04:09:21.7180176Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:09:21.7180809Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:09:21.7181177Z Selected tests: 2022-11-23T04:09:21.7181464Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T04:09:21.7213030Z Prioritized test from test file changes. 
2022-11-23T04:09:21.7213444Z reordering tests for PR: 2022-11-23T04:09:21.7213721Z prioritized: [] 2022-11-23T04:09:21.7214329Z the rest: ['distributed/_shard/sharded_tensor/ops/test_math_ops'] 2022-11-23T04:09:21.7214569Z 2022-11-23T04:09:21.7215427Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:09:21.7216421Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:09:21.7220029Z parallel (file granularity) tests: 2022-11-23T04:09:21.7220348Z 2022-11-23T04:09:21.7220674Z serial (file granularity) tests: 2022-11-23T04:09:21.7221033Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T04:09:24.0480853Z Ignoring disabled issues: [] 2022-11-23T04:09:24.4885053Z Running distributed/_shard/sharded_tensor/ops/test_math_ops ... [2022-11-23 04:09:24.487894] 2022-11-23T04:09:24.4886335Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_math_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 04:09:24.488374] 2022-11-23T04:09:26.7400948Z 2022-11-23T04:09:26.7401666Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T04:09:26.7402703Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_math_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_math_ops_gsnjq4jg) 2022-11-23T04:09:26.7403093Z 2022-11-23T04:09:26.7403396Z ##[endgroup] 2022-11-23T04:09:26.7404177Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_math_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_math_ops_gsnjq4jg) 2022-11-23T04:09:26.7404885Z 2022-11-23T04:09:27.0854688Z 2022-11-23T04:09:27.0855466Z real 0m7.819s 2022-11-23T04:09:27.0855750Z user 0m13.546s 2022-11-23T04:09:27.0855977Z sys 0m12.364s 2022-11-23T04:09:27.0856543Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T04:09:29.4569245Z Ignoring disabled issues: [] 2022-11-23T04:09:29.5115441Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:09:29.5116020Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:09:29.5116374Z Selected tests: 2022-11-23T04:09:29.5116674Z distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T04:09:29.5139521Z Prioritized test from test file changes. 
2022-11-23T04:09:29.5139869Z reordering tests for PR: 2022-11-23T04:09:29.5140162Z prioritized: [] 2022-11-23T04:09:29.5140669Z the rest: ['distributed/_shard/sharded_tensor/ops/test_matrix_ops'] 2022-11-23T04:09:29.5140907Z 2022-11-23T04:09:29.5141442Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:09:29.5142373Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:09:29.5147957Z parallel (file granularity) tests: 2022-11-23T04:09:29.5148209Z 2022-11-23T04:09:29.5148458Z serial (file granularity) tests: 2022-11-23T04:09:29.5148789Z distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T04:09:31.8548599Z Ignoring disabled issues: [] 2022-11-23T04:09:32.2575364Z Running distributed/_shard/sharded_tensor/ops/test_matrix_ops ... [2022-11-23 04:09:32.257009] 2022-11-23T04:09:32.2577497Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_matrix_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:09:32.257500] 2022-11-23T04:10:24.9289020Z 2022-11-23T04:10:24.9290027Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T04:10:24.9291050Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_matrix_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_matrix_ops_h1b_fc7v) 2022-11-23T04:10:24.9294060Z 2022-11-23T04:10:24.9294301Z Running tests... 
2022-11-23T04:10:24.9294858Z ---------------------------------------------------------------------- 2022-11-23T04:10:24.9295474Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops 2022-11-23T04:10:24.9296042Z test_sharded_tensor_contiguous (__main__.TestShardedTensorMatrixOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:10:24.9296540Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25327 2022-11-23T04:10:24.9296996Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25328 2022-11-23T04:10:24.9299759Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25329 2022-11-23T04:10:24.9300235Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25330 2022-11-23T04:10:24.9300916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9301373Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9301965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9302432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9303012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9303668Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9304584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9305065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9305646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9306475Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9307046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9307537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9308116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9308573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9309125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9309599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9310042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9310539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9311013Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9311482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9311955Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9312422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9312916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9313408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9314069Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:10:24.9314851Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9315718Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9316401Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9316794Z ok (6.177s) 2022-11-23T04:10:24.9317242Z test_sharded_tensor_layer_norm (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25612 2022-11-23T04:10:24.9317789Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25613 2022-11-23T04:10:24.9318240Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25614 2022-11-23T04:10:24.9318684Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25615 2022-11-23T04:10:24.9319273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9319869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9320424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9320859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9321518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9321946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9322665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9323110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T04:10:24.9323682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9324125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9324695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9325139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9326025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9326472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9327026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9327487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9327921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9328394Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9328839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9329323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9329815Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9330287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9330764Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9331243Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9332051Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9332750Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9333421Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9334076Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9334456Z ok (4.518s) 2022-11-23T04:10:24.9334886Z test_sharded_tensor_layer_norm_error (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25901 2022-11-23T04:10:24.9335432Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25902 2022-11-23T04:10:24.9335863Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25903 2022-11-23T04:10:24.9336273Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25904 2022-11-23T04:10:24.9337060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9337509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9338078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9338532Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9339175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9339621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9340197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: 
loaded 420 disabled tests 2022-11-23T04:10:24.9340645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9341220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9341662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9342211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9342674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9343245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9343840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9344789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9345256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9345693Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9346152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9346618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9347099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9347754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9348200Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9348665Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9349135Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9349950Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9350699Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9351388Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9352064Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9352605Z ok (4.418s) 2022-11-23T04:10:24.9353031Z test_sharded_tensor_masked_fill (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26314 2022-11-23T04:10:24.9353559Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26315 2022-11-23T04:10:24.9354083Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26316 2022-11-23T04:10:24.9354490Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26317 2022-11-23T04:10:24.9355075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9355511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9356062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9356497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9357141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9357570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T04:10:24.9358100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9358547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9359105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9359530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9360060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9360501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9361059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9361481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9362184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9362643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9363078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9363535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9364016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9364497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9364959Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9365430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9365917Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9366400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9367034Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9367780Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9368467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9369141Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9369508Z ok (4.518s) 2022-11-23T04:10:24.9369972Z test_sharded_tensor_masked_fill_error (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26603 2022-11-23T04:10:24.9370525Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26604 2022-11-23T04:10:24.9370974Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26605 2022-11-23T04:10:24.9371397Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26606 2022-11-23T04:10:24.9371998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9372444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9372992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9373458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9374097Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9374694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9375223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9375667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9376225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9376651Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9377346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9377793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9378364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9378810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9379394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9379857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9380293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9380906Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9381552Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9382029Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9382466Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 2 2022-11-23T04:10:24.9382951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9383433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9384252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9384906Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9385671Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9386361Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9387033Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9387401Z ok (4.317s) 2022-11-23T04:10:24.9388012Z test_sharded_tensor_softmax (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26888 2022-11-23T04:10:24.9388542Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26889 2022-11-23T04:10:24.9388974Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26890 2022-11-23T04:10:24.9389382Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26891 2022-11-23T04:10:24.9389963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9390393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9390926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9391375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9392009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9392433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9392965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9393406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9393958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9394362Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9394910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9395348Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9395902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:10:24.9396306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9396850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9397469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9397886Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9398358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9398816Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9399284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9399744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9400237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9400868Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9401333Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9402140Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9402879Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9403569Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9404244Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:10:24.9404614Z ok (4.417s) 2022-11-23T04:10:24.9405069Z test_sharded_tensor_transpose (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27177 2022-11-23T04:10:24.9405772Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27178 2022-11-23T04:10:24.9406367Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27179 2022-11-23T04:10:24.9406801Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27180 2022-11-23T04:10:24.9407409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9407856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9408410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9408962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9409680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9410085Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9410635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9411078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9411632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9412039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9412762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9413202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T04:10:24.9413774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9414220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9414802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9415261Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9415682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9416155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9416623Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9417087Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9417710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9418183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9418651Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9419098Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9419788Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9420642Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9421328Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:10:24.9421979Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9422371Z ok (4.418s) 2022-11-23T04:10:24.9422838Z test_sharded_tensor_transpose_error (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27466 2022-11-23T04:10:24.9423547Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27467 2022-11-23T04:10:24.9424329Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27468 2022-11-23T04:10:24.9424780Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27469 2022-11-23T04:10:24.9425391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9425817Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9426385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9426956Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9427532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9427953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9428524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9428983Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9429557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9429979Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9430544Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9431004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9431561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9432001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9432717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9433158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9433564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9434019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9434487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9434946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9435416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9436045Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9436519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9436981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9437705Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9438397Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:10:24.9439075Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9439731Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9440126Z ok (4.317s) 2022-11-23T04:10:24.9440578Z test_sharded_tensor_type_as (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27751 2022-11-23T04:10:24.9441123Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27752 2022-11-23T04:10:24.9441548Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27753 2022-11-23T04:10:24.9441982Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27754 2022-11-23T04:10:24.9442588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9443021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9460400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9461080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9461698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9462157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9462733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9463204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9463766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T04:10:24.9464483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9465066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9465519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9466092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9466534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9467134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9467575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9468017Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9468489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9468958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9469439Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9469923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9470398Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9470857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9471350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9472112Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:10:24.9472820Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9473481Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9474166Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9474555Z ok (4.317s) 2022-11-23T04:10:24.9475005Z test_sharded_tensor_view (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28036 2022-11-23T04:10:24.9475523Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28037 2022-11-23T04:10:24.9476121Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28038 2022-11-23T04:10:24.9476736Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28039 2022-11-23T04:10:24.9477323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9477773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9478486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9479049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9479610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9480053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9480619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9481071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T04:10:24.9481638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9482074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9482635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9483077Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9483654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9484092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9484796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9485218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9485643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9486096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9486548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9487030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9487503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9487950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9488390Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9488863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9489553Z 
INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9490233Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9490868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9491524Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9491900Z ok (4.518s) 2022-11-23T04:10:24.9492323Z test_sharded_tensor_view_error (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28325 2022-11-23T04:10:24.9492853Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28326 2022-11-23T04:10:24.9493288Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28327 2022-11-23T04:10:24.9493719Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28328 2022-11-23T04:10:24.9494282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9494714Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9495268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9495755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9496310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9496736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9497270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:10:24.9497863Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9498431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9498897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9499481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9499927Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9500506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:24.9500959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:24.9501511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:24.9501984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:24.9502421Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:24.9502888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:24.9503335Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:24.9503822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:24.9504579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:24.9505053Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:24.9505537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:24.9506005Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:24.9506734Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9507409Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9508085Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9508763Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:24.9509160Z ok (4.317s) 2022-11-23T04:10:24.9509290Z 2022-11-23T04:10:24.9509561Z ---------------------------------------------------------------------- 2022-11-23T04:10:24.9510046Z Ran 11 tests in 50.253s 2022-11-23T04:10:24.9510203Z 2022-11-23T04:10:24.9510292Z OK 2022-11-23T04:10:24.9510414Z 2022-11-23T04:10:24.9510514Z Generating XML reports... 
2022-11-23T04:10:24.9511162Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops/TEST-TestShardedTensorMatrixOps-20221123040934.xml 2022-11-23T04:10:24.9511553Z 2022-11-23T04:10:24.9512002Z ##[endgroup] 2022-11-23T04:10:24.9512666Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_matrix_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_matrix_ops_h1b_fc7v) 2022-11-23T04:10:24.9513140Z 2022-11-23T04:10:25.2680307Z 2022-11-23T04:10:25.2680694Z real 0m58.183s 2022-11-23T04:10:25.2680954Z user 3m1.184s 2022-11-23T04:10:25.2681196Z sys 2m4.086s 2022-11-23T04:10:25.2681719Z + python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T04:10:27.6264050Z Ignoring disabled issues: [] 2022-11-23T04:10:27.6814200Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:10:27.6814856Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:10:27.6815203Z Selected tests: 2022-11-23T04:10:27.6815519Z distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T04:10:27.6842747Z Prioritized test from test file changes. 
2022-11-23T04:10:27.6843841Z reordering tests for PR: 2022-11-23T04:10:27.6844156Z prioritized: [] 2022-11-23T04:10:27.6844640Z the rest: ['distributed/_shard/sharded_tensor/ops/test_softmax'] 2022-11-23T04:10:27.6844896Z 2022-11-23T04:10:27.6845430Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:10:27.6846371Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:10:27.6852888Z parallel (file granularity) tests: 2022-11-23T04:10:27.6853286Z 2022-11-23T04:10:27.6853562Z serial (file granularity) tests: 2022-11-23T04:10:27.6853898Z distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T04:10:29.9881284Z Ignoring disabled issues: [] 2022-11-23T04:10:30.4155734Z Running distributed/_shard/sharded_tensor/ops/test_softmax ... [2022-11-23 04:10:30.414953] 2022-11-23T04:10:30.4157218Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_softmax.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:10:30.415442] 2022-11-23T04:10:43.2945889Z 2022-11-23T04:10:43.2946682Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T04:10:43.2947758Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_softmax (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_softmax_ihtoyuhv) 2022-11-23T04:10:43.2948395Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpayia6bzs 2022-11-23T04:10:43.2949354Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpayia6bzs/_remote_module_non_scriptable.py 2022-11-23T04:10:43.2949691Z 2022-11-23T04:10:43.2949809Z Running tests... 
2022-11-23T04:10:43.2950385Z ---------------------------------------------------------------------- 2022-11-23T04:10:43.2950976Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax 2022-11-23T04:10:43.2951514Z test_sharded_softmax_basic (__main__.TestShardedSoftmax) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:10:43.2952002Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28822 2022-11-23T04:10:43.2952455Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28823 2022-11-23T04:10:43.2952881Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28824 2022-11-23T04:10:43.2953321Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28825 2022-11-23T04:10:43.2953953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:43.2954393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:43.2955027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:43.2955524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:43.2956233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:43.2956688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:43.2957247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:43.2957719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:43.2958304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:43.2958724Z warnings.warn(f"loaded {len(slow_tests_dict)} 
slow tests") 2022-11-23T04:10:43.2959297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:43.2959833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:43.2960413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:43.2960839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:43.2961401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:43.2961859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:43.2962300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp6td26yh 2022-11-23T04:10:43.2962846Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp6td26yh/_remote_module_non_scriptable.py 2022-11-23T04:10:43.2963515Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:43.2964000Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu9jpjpgs 2022-11-23T04:10:43.2964501Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu9jpjpgs/_remote_module_non_scriptable.py 2022-11-23T04:10:43.2965179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:43.2965681Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_has99ms 2022-11-23T04:10:43.2966212Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_has99ms/_remote_module_non_scriptable.py 2022-11-23T04:10:43.2966718Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplbvnvili 2022-11-23T04:10:43.2967313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplbvnvili/_remote_module_non_scriptable.py 
2022-11-23T04:10:43.2967852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:43.2968320Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:43.2968788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:43.2969268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:43.2969753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:43.2970219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:43.2970879Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:43.2971710Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:43.2972374Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:43.2973014Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:43.2973646Z ok (6.054s) 2022-11-23T04:10:43.2974098Z test_sharded_softmax_on_sharding_dim (__main__.TestShardedSoftmax) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29111 2022-11-23T04:10:43.2974637Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29112 2022-11-23T04:10:43.2975065Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29113 2022-11-23T04:10:43.2975505Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29114 2022-11-23T04:10:43.2976121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:43.2976556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:43.2977458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:43.2977924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:43.2978504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:43.2978927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:43.2979576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:43.2980044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:43.2980603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:43.2981050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:43.2981619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:43.2982080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:43.2982634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:10:43.2983081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:43.2983650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:43.2984489Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:43.2985241Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiofeslyt 2022-11-23T04:10:43.2985896Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiofeslyt/_remote_module_non_scriptable.py 2022-11-23T04:10:43.2986421Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:43.2986891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:43.2987402Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpppruna9u 2022-11-23T04:10:43.2987944Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpppruna9u/_remote_module_non_scriptable.py 2022-11-23T04:10:43.2988629Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpicm44ugp 2022-11-23T04:10:43.2989123Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpicm44ugp/_remote_module_non_scriptable.py 2022-11-23T04:10:43.2989631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:43.2990140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:43.2990818Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptowcxktg 2022-11-23T04:10:43.2991337Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptowcxktg/_remote_module_non_scriptable.py 2022-11-23T04:10:43.2991840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:43.2992409Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:43.2992883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:43.2993529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:43.2994183Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:43.2994852Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:43.2995491Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:43.2996146Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:43.2996530Z ok (4.417s) 2022-11-23T04:10:43.2996678Z 2022-11-23T04:10:43.2996948Z ---------------------------------------------------------------------- 2022-11-23T04:10:43.2997257Z Ran 2 tests in 10.472s 2022-11-23T04:10:43.2997415Z 2022-11-23T04:10:43.2997508Z OK 2022-11-23T04:10:43.2997637Z 2022-11-23T04:10:43.2997756Z Generating XML reports... 
2022-11-23T04:10:43.2998345Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax/TEST-TestShardedSoftmax-20221123041032.xml 2022-11-23T04:10:43.2998703Z 2022-11-23T04:10:43.2999021Z ##[endgroup] 2022-11-23T04:10:43.2999664Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_softmax (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_softmax_ihtoyuhv) 2022-11-23T04:10:43.3000043Z 2022-11-23T04:10:43.6415329Z 2022-11-23T04:10:43.6415606Z real 0m18.373s 2022-11-23T04:10:43.6415896Z user 0m44.968s 2022-11-23T04:10:43.6416136Z sys 0m35.149s 2022-11-23T04:10:43.6416657Z + python test/run_test.py --verbose -i distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T04:10:45.9893106Z Ignoring disabled issues: [] 2022-11-23T04:10:46.0438583Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:10:46.0439192Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:10:46.0439528Z Selected tests: 2022-11-23T04:10:46.0439846Z distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T04:10:46.0465332Z Prioritized test from test file changes. 
2022-11-23T04:10:46.0465665Z reordering tests for PR: 2022-11-23T04:10:46.0465963Z prioritized: [] 2022-11-23T04:10:46.0466489Z the rest: ['distributed/_shard/sharded_optim/test_sharded_optim'] 2022-11-23T04:10:46.0466726Z 2022-11-23T04:10:46.0467324Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:10:46.0468259Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:10:46.0474931Z parallel (file granularity) tests: 2022-11-23T04:10:46.0475232Z 2022-11-23T04:10:46.0475491Z serial (file granularity) tests: 2022-11-23T04:10:46.0475815Z distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T04:10:48.4163716Z Ignoring disabled issues: [] 2022-11-23T04:10:48.8414028Z Running distributed/_shard/sharded_optim/test_sharded_optim ... [2022-11-23 04:10:48.840740] 2022-11-23T04:10:48.8415829Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_optim/test_sharded_optim.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:10:48.841240] 2022-11-23T04:10:57.4152748Z 2022-11-23T04:10:57.4153723Z Expand the folded group to see the log file of distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T04:10:57.4155296Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_optim/test_sharded_optim (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_optim-test_sharded_optim_n_7r85en) 2022-11-23T04:10:57.4155777Z 2022-11-23T04:10:57.4155908Z Running tests... 
2022-11-23T04:10:57.4156481Z ---------------------------------------------------------------------- 2022-11-23T04:10:57.4157111Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim 2022-11-23T04:10:57.4157674Z test_named_params_with_sharded_tensor (__main__.TestShardedOptimizer) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:10:57.4158741Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82023 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.749s) 2022-11-23T04:10:57.4159462Z test_sharded_optim (__main__.TestShardedOptimizer) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29612 2022-11-23T04:10:57.4160005Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29613 2022-11-23T04:10:57.4160600Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29614 2022-11-23T04:10:57.4161089Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29615 2022-11-23T04:10:57.4161647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:57.4162200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:57.4162649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:57.4163120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:57.4163698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:57.4164147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:57.4164688Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:57.4165149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:57.4165793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:57.4166255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:57.4166913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:57.4167308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:57.4167968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:10:57.4168476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:10:57.4169145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:10:57.4169537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:10:57.4169965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:10:57.4170501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:10:57.4170961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:10:57.4171721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:10:57.4172078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:10:57.4172714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:10:57.4173105Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:10:57.4173584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:10:57.4174206Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:57.4174887Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:57.4175564Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:57.4176309Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:10:57.4176608Z ok (4.420s) 2022-11-23T04:10:57.4176763Z 2022-11-23T04:10:57.4177031Z ---------------------------------------------------------------------- 2022-11-23T04:10:57.4177431Z Ran 2 tests in 6.169s 2022-11-23T04:10:57.4177609Z 2022-11-23T04:10:57.4177624Z OK (skipped=1) 2022-11-23T04:10:57.4177783Z 2022-11-23T04:10:57.4177910Z Generating XML reports... 
2022-11-23T04:10:57.4178784Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim/TEST-TestShardedOptimizer-20221123041050.xml 2022-11-23T04:10:57.4179113Z 2022-11-23T04:10:57.4179414Z ##[endgroup] 2022-11-23T04:10:57.4180098Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_optim/test_sharded_optim (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_optim-test_sharded_optim_n_7r85en) 2022-11-23T04:10:57.4180500Z 2022-11-23T04:10:57.8075426Z 2022-11-23T04:10:57.8075960Z real 0m14.166s 2022-11-23T04:10:57.8076435Z user 0m30.236s 2022-11-23T04:10:57.8076685Z sys 0m25.047s 2022-11-23T04:10:57.8077281Z + python test/run_test.py --verbose -i distributed/_shard/test_partial_tensor 2022-11-23T04:11:00.1863059Z Ignoring disabled issues: [] 2022-11-23T04:11:00.2402080Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:11:00.2402659Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:11:00.2403015Z Selected tests: 2022-11-23T04:11:00.2403288Z distributed/_shard/test_partial_tensor 2022-11-23T04:11:00.2429232Z Prioritized test from test file changes. 
2022-11-23T04:11:00.2430004Z reordering tests for PR: 2022-11-23T04:11:00.2430288Z prioritized: [] 2022-11-23T04:11:00.2430849Z the rest: ['distributed/_shard/test_partial_tensor'] 2022-11-23T04:11:00.2431074Z 2022-11-23T04:11:00.2431569Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:11:00.2432510Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:11:00.2437307Z parallel (file granularity) tests: 2022-11-23T04:11:00.2437732Z 2022-11-23T04:11:00.2437994Z serial (file granularity) tests: 2022-11-23T04:11:00.2438342Z distributed/_shard/test_partial_tensor 2022-11-23T04:11:02.5456214Z Ignoring disabled issues: [] 2022-11-23T04:11:02.9089342Z Running distributed/_shard/test_partial_tensor ... [2022-11-23 04:11:02.908305] 2022-11-23T04:11:02.9091202Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/test_partial_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 04:11:02.908770] 2022-11-23T04:11:28.9316533Z 2022-11-23T04:11:28.9317185Z Expand the folded group to see the log file of distributed/_shard/test_partial_tensor 2022-11-23T04:11:28.9321423Z ##[group]PRINTING LOG FILE of distributed/_shard/test_partial_tensor (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-test_partial_tensor_1nym_t0h) 2022-11-23T04:11:28.9322101Z 2022-11-23T04:11:28.9322200Z Running tests... 2022-11-23T04:11:28.9322761Z ---------------------------------------------------------------------- 2022-11-23T04:11:28.9323332Z Test results will be stored in test-reports/python-unittest/distributed._shard.test_partial_tensor 2022-11-23T04:11:28.9323835Z test_cat (__main__.TestPartialTensorOps) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T04:11:28.9324288Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30125 2022-11-23T04:11:28.9324743Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30126 2022-11-23T04:11:28.9325200Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30127 2022-11-23T04:11:28.9325631Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30128 2022-11-23T04:11:28.9326260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9326721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9327296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9328534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9329130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9329578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9330163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9330612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9331187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9331634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9332203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9332653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9333223Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9333777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9334347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9334813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9335250Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:11:28.9335728Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:11:28.9336175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:11:28.9337075Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:11:28.9337572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:11:28.9338080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:11:28.9338559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:11:28.9339038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:11:28.9339701Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9340474Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9341136Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:11:28.9341807Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9342190Z ok (6.227s) 2022-11-23T04:11:28.9342604Z test_cat_errors (__main__.TestPartialTensorOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30410 2022-11-23T04:11:28.9343131Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30411 2022-11-23T04:11:28.9343577Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30412 2022-11-23T04:11:28.9344485Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30413 2022-11-23T04:11:28.9345097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9345547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9346117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9346583Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9347141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9347585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9348148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9348592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9349164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9349604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9350171Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9350614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9351184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9351719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9352283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9352834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9353257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:11:28.9353733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:11:28.9354192Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:11:28.9354676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:11:28.9355138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:11:28.9355617Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:11:28.9356108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:11:28.9356590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:11:28.9357223Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9357994Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:11:28.9358670Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9359342Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9359709Z ok (4.318s) 2022-11-23T04:11:28.9360137Z test_transpose (__main__.TestPartialTensorOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30695 2022-11-23T04:11:28.9360657Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30696 2022-11-23T04:11:28.9361084Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30697 2022-11-23T04:11:28.9361516Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30698 2022-11-23T04:11:28.9362113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9362563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9363111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9363573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9364144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9364571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9365136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9365599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9366169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9366599Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9367162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9367621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9368173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9368666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9369237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9369697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9370119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:11:28.9370590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:11:28.9371075Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:11:28.9371552Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:11:28.9371994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:11:28.9372460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:11:28.9372943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:11:28.9373408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:11:28.9374058Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:11:28.9374798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9375473Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9376121Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9376503Z ok (4.317s) 2022-11-23T04:11:28.9376957Z test_partial_tensor_reshard (__main__.TestPartialTensorReshard) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30980 2022-11-23T04:11:28.9377500Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30981 2022-11-23T04:11:28.9377927Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30982 2022-11-23T04:11:28.9378359Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30983 2022-11-23T04:11:28.9378961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9379395Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9379968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9380427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9380999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9381423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9381988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9382446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T04:11:28.9383014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9383444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9384218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9384685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9385243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9385767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9386344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9386803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9387214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:11:28.9387691Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:11:28.9388155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:11:28.9388600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:11:28.9389083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:11:28.9389570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:11:28.9390056Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:11:28.9390521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:11:28.9391165Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9391983Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9392664Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9393325Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9393851Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T04:11:28.9394342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T04:11:28.9394824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T04:11:28.9395285Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T04:11:28.9395923Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:11:28.9396453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 2 2022-11-23T04:11:28.9397078Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:11:28.9397754Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T04:11:28.9398425Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2022-11-23T04:11:28.9398950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 3 2022-11-23T04:11:28.9399523Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T04:11:28.9400002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T04:11:28.9400648Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:11:28.9401321Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:11:28.9401971Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:11:28.9402696Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:3 with 4 nodes. 2022-11-23T04:11:28.9403084Z ok (4.418s) 2022-11-23T04:11:28.9403537Z test_partial_tensor_reshard_errors (__main__.TestPartialTensorReshard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31283 2022-11-23T04:11:28.9404066Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31284 2022-11-23T04:11:28.9404515Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31285 2022-11-23T04:11:28.9404955Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31286 2022-11-23T04:11:28.9405545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9405988Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9406552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9407015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9407566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9408009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9408573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9409086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9409641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:28.9410081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9410642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9411084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9411655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T04:11:28.9412093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:28.9412651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:28.9413094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:28.9413525Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T04:11:28.9413994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T04:11:28.9414440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T04:11:28.9414902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T04:11:28.9415383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T04:11:28.9415873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T04:11:28.9416340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T04:11:28.9416821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T04:11:28.9417468Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9418147Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9418805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T04:11:28.9419532Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T04:11:28.9419927Z ok (4.318s) 2022-11-23T04:11:28.9420073Z 2022-11-23T04:11:28.9420327Z ---------------------------------------------------------------------- 2022-11-23T04:11:28.9420660Z Ran 5 tests in 23.599s 2022-11-23T04:11:28.9420821Z 2022-11-23T04:11:28.9420915Z OK 2022-11-23T04:11:28.9421045Z 2022-11-23T04:11:28.9421173Z Generating XML reports... 2022-11-23T04:11:28.9421768Z Generated XML report: test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorOps-20221123041104.xml 2022-11-23T04:11:28.9422574Z Generated XML report: test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorReshard-20221123041104.xml 2022-11-23T04:11:28.9422940Z 2022-11-23T04:11:28.9423309Z ##[endgroup] 2022-11-23T04:11:28.9424181Z FINISHED PRINTING LOG FILE of distributed/_shard/test_partial_tensor (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-test_partial_tensor_1nym_t0h) 2022-11-23T04:11:28.9424635Z 2022-11-23T04:11:29.3274304Z 2022-11-23T04:11:29.3274746Z real 0m31.520s 2022-11-23T04:11:29.3275014Z user 1m30.307s 2022-11-23T04:11:29.3275274Z sys 1m4.178s 2022-11-23T04:11:29.3275890Z + python test/run_test.py --verbose -i distributed/_shard/test_replicated_tensor 2022-11-23T04:11:31.7438462Z Ignoring disabled issues: [] 2022-11-23T04:11:31.7985354Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:11:31.7986314Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:11:31.7986678Z Selected tests: 2022-11-23T04:11:31.7986953Z distributed/_shard/test_replicated_tensor 2022-11-23T04:11:31.8012955Z Prioritized test from test file changes. 
2022-11-23T04:11:31.8013335Z reordering tests for PR: 2022-11-23T04:11:31.8013803Z prioritized: [] 2022-11-23T04:11:31.8014506Z the rest: ['distributed/_shard/test_replicated_tensor'] 2022-11-23T04:11:31.8014787Z 2022-11-23T04:11:31.8015246Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:11:31.8016196Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:11:31.8021257Z parallel (file granularity) tests: 2022-11-23T04:11:31.8022037Z 2022-11-23T04:11:31.8022277Z serial (file granularity) tests: 2022-11-23T04:11:31.8022612Z distributed/_shard/test_replicated_tensor 2022-11-23T04:11:34.0767876Z Ignoring disabled issues: [] 2022-11-23T04:11:34.4416754Z Running distributed/_shard/test_replicated_tensor ... [2022-11-23 04:11:34.441037] 2022-11-23T04:11:34.4417894Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/test_replicated_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 04:11:34.441534] 2022-11-23T04:11:36.6666504Z 2022-11-23T04:11:36.6667209Z Expand the folded group to see the log file of distributed/_shard/test_replicated_tensor 2022-11-23T04:11:36.6668409Z ##[group]PRINTING LOG FILE of distributed/_shard/test_replicated_tensor (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-test_replicated_tensor_c6vpqb1_) 2022-11-23T04:11:36.6668795Z 2022-11-23T04:11:36.6669114Z ##[endgroup] 2022-11-23T04:11:36.6669924Z FINISHED PRINTING LOG FILE of distributed/_shard/test_replicated_tensor (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-test_replicated_tensor_c6vpqb1_) 2022-11-23T04:11:36.6670301Z 2022-11-23T04:11:37.0485406Z 2022-11-23T04:11:37.0486699Z real 0m7.721s 2022-11-23T04:11:37.0487003Z user 0m12.425s 2022-11-23T04:11:37.0487247Z sys 0m11.731s 2022-11-23T04:11:37.0487784Z + python test/run_test.py --verbose -i test_cuda_primary_ctx 2022-11-23T04:11:39.4343558Z Ignoring disabled issues: [] 2022-11-23T04:11:39.4901571Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 2022-11-23T04:11:39.4902240Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T04:11:39.4903031Z Selected tests: 2022-11-23T04:11:39.4903327Z test_cuda_primary_ctx 2022-11-23T04:11:39.4929286Z Prioritized test from test file changes. 
2022-11-23T04:11:39.4929665Z reordering tests for PR: 2022-11-23T04:11:39.4930007Z prioritized: [] 2022-11-23T04:11:39.4930501Z the rest: ['test_cuda_primary_ctx'] 2022-11-23T04:11:39.4930699Z 2022-11-23T04:11:39.4931264Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T04:11:39.4932132Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T04:11:39.4938677Z parallel (file granularity) tests: 2022-11-23T04:11:39.4938997Z 2022-11-23T04:11:39.4939263Z serial (file granularity) tests: 2022-11-23T04:11:39.4939533Z test_cuda_primary_ctx 2022-11-23T04:11:41.7943775Z Ignoring disabled issues: [] 2022-11-23T04:11:42.2134483Z Running test_cuda_primary_ctx ... [2022-11-23 04:11:42.212864] 2022-11-23T04:11:42.2135411Z Executing ['/opt/conda/bin/python', '-bb', 'test_cuda_primary_ctx.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 04:11:42.213276] 2022-11-23T04:11:59.3072905Z 2022-11-23T04:11:59.3073596Z Expand the folded group to see the log file of test_cuda_primary_ctx 2022-11-23T04:11:59.3074600Z ##[group]PRINTING LOG FILE of test_cuda_primary_ctx (/var/lib/jenkins/workspace/test/test-reports/test_cuda_primary_ctx_waqqil_t) 2022-11-23T04:11:59.3075107Z 2022-11-23T04:11:59.3075732Z , <__main__.TestCudaPrimaryCtx testMethod=test_pin_memory>, <__main__.TestCudaPrimaryCtx testMethod=test_str_repr>]> 2022-11-23T04:11:59.3076187Z test_copy (__main__.TestCudaPrimaryCtx) 2022-11-23T04:11:59.3076598Z test_pin_memory (__main__.TestCudaPrimaryCtx) 2022-11-23T04:11:59.3076923Z test_str_repr (__main__.TestCudaPrimaryCtx) 2022-11-23T04:11:59.3077588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:59.3078068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:59.3078661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:59.3079151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:59.3079388Z 2022-11-23T04:11:59.3079482Z Running tests... 2022-11-23T04:11:59.3079905Z ---------------------------------------------------------------------- 2022-11-23T04:11:59.3080434Z Test results will be stored in test-reports/python-unittest/test_cuda_primary_ctx 2022-11-23T04:11:59.3080851Z test_copy (__main__.TestCudaPrimaryCtx) ... ok (1.397s) 2022-11-23T04:11:59.3081095Z 2022-11-23T04:11:59.3081350Z ---------------------------------------------------------------------- 2022-11-23T04:11:59.3081690Z Ran 1 test in 2.491s 2022-11-23T04:11:59.3081858Z 2022-11-23T04:11:59.3081956Z OK 2022-11-23T04:11:59.3082096Z 2022-11-23T04:11:59.3082227Z Generating XML reports... 
2022-11-23T04:11:59.3082780Z Generated XML report: test-reports/python-unittest/test_cuda_primary_ctx/TEST-TestCudaPrimaryCtx-20221123041146.xml 2022-11-23T04:11:59.3083471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:59.3083931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:59.3084487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:59.3085269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:59.3085547Z 2022-11-23T04:11:59.3085668Z Running tests... 2022-11-23T04:11:59.3086175Z ---------------------------------------------------------------------- 2022-11-23T04:11:59.3086644Z Test results will be stored in test-reports/python-unittest/test_cuda_primary_ctx 2022-11-23T04:11:59.3087105Z test_pin_memory (__main__.TestCudaPrimaryCtx) ... ok (1.412s) 2022-11-23T04:11:59.3087340Z 2022-11-23T04:11:59.3087625Z ---------------------------------------------------------------------- 2022-11-23T04:11:59.3087956Z Ran 1 test in 2.503s 2022-11-23T04:11:59.3088127Z 2022-11-23T04:11:59.3088218Z OK 2022-11-23T04:11:59.3088355Z 2022-11-23T04:11:59.3088482Z Generating XML reports... 
2022-11-23T04:11:59.3089072Z Generated XML report: test-reports/python-unittest/test_cuda_primary_ctx/TEST-TestCudaPrimaryCtx-20221123041151.xml 2022-11-23T04:11:59.3089779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T04:11:59.3090259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T04:11:59.3090931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T04:11:59.3091340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T04:11:59.3091721Z 2022-11-23T04:11:59.3091831Z Running tests... 2022-11-23T04:11:59.3092258Z ---------------------------------------------------------------------- 2022-11-23T04:11:59.3092801Z Test results will be stored in test-reports/python-unittest/test_cuda_primary_ctx 2022-11-23T04:11:59.3093224Z test_str_repr (__main__.TestCudaPrimaryCtx) ... ok (1.352s) 2022-11-23T04:11:59.3093450Z 2022-11-23T04:11:59.3093726Z ---------------------------------------------------------------------- 2022-11-23T04:11:59.3094066Z Ran 1 test in 2.430s 2022-11-23T04:11:59.3094234Z 2022-11-23T04:11:59.3094308Z OK 2022-11-23T04:11:59.3094456Z 2022-11-23T04:11:59.3094665Z Generating XML reports... 
2022-11-23T04:11:59.3095193Z Generated XML report: test-reports/python-unittest/test_cuda_primary_ctx/TEST-TestCudaPrimaryCtx-20221123041156.xml 2022-11-23T04:11:59.3095548Z 2022-11-23T04:11:59.3095861Z ##[endgroup] 2022-11-23T04:11:59.3096428Z FINISHED PRINTING LOG FILE of test_cuda_primary_ctx (/var/lib/jenkins/workspace/test/test-reports/test_cuda_primary_ctx_waqqil_t) 2022-11-23T04:11:59.3096773Z 2022-11-23T04:11:59.6587123Z 2022-11-23T04:11:59.6587675Z real 0m22.610s 2022-11-23T04:11:59.6587993Z user 0m32.334s 2022-11-23T04:11:59.6588239Z sys 0m30.000s 2022-11-23T04:11:59.6588473Z + assert_git_not_dirty 2022-11-23T04:11:59.6589053Z + [[ linux-bionic-cuda11.6-py3.9-gcc7 != *rocm* ]] 2022-11-23T04:11:59.6589485Z + [[ linux-bionic-cuda11.6-py3.9-gcc7 != *xla* ]] 2022-11-23T04:11:59.6595494Z ++ git status --porcelain 2022-11-23T04:12:01.3458182Z + git_status= 2022-11-23T04:12:01.3458666Z + [[ -n '' ]] 2022-11-23T04:12:01.3537543Z Prepare all required actions 2022-11-23T04:12:01.3537980Z Getting action download info 2022-11-23T04:12:01.5328588Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2022-11-23T04:12:01.7553040Z ##[group]Run ./.github/actions/get-workflow-job-id 2022-11-23T04:12:01.7553328Z with: 2022-11-23T04:12:01.7553812Z github-token: *** 2022-11-23T04:12:01.7554056Z env: 2022-11-23T04:12:01.7554276Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:01.7554545Z GPU_FLAG: --gpus all 2022-11-23T04:12:01.7554995Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:01.7555327Z ##[endgroup] 2022-11-23T04:12:01.7590479Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2022-11-23T04:12:01.7590796Z with: 2022-11-23T04:12:01.7591014Z shell: bash 2022-11-23T04:12:01.7591239Z timeout_minutes: 10 2022-11-23T04:12:01.7591480Z max_attempts: 5 2022-11-23T04:12:01.7591769Z retry_wait_seconds: 30 
2022-11-23T04:12:01.7592268Z command: set -eux python3 -m pip install requests==2.26.0 GHA_WORKFLOW_JOB_ID=$(python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}") echo "job-id=${GHA_WORKFLOW_JOB_ID}" >> "${GITHUB_OUTPUT}" 2022-11-23T04:12:01.7592775Z polling_interval_seconds: 1 2022-11-23T04:12:01.7593047Z warning_on_retry: true 2022-11-23T04:12:01.7593309Z continue_on_error: false 2022-11-23T04:12:01.7593532Z env: 2022-11-23T04:12:01.7593768Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:01.7594092Z GPU_FLAG: --gpus all 2022-11-23T04:12:01.7594437Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:01.7594924Z GITHUB_TOKEN: *** 2022-11-23T04:12:01.7595169Z ##[endgroup] 2022-11-23T04:12:01.8311768Z + python3 -m pip install requests==2.26.0 2022-11-23T04:12:02.1317529Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T04:12:02.2797094Z Collecting requests==2.26.0 2022-11-23T04:12:02.2967796Z Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB) 2022-11-23T04:12:02.3853772Z Collecting certifi>=2017.4.17 2022-11-23T04:12:02.3922067Z Downloading certifi-2022.9.24-py3-none-any.whl (161 kB) 2022-11-23T04:12:02.5815402Z Collecting charset-normalizer~=2.0.0; python_version >= "3" 2022-11-23T04:12:02.5923936Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2022-11-23T04:12:02.7094264Z Collecting urllib3<1.27,>=1.21.1 2022-11-23T04:12:02.7137386Z Downloading urllib3-1.26.12-py2.py3-none-any.whl (140 kB) 2022-11-23T04:12:02.7957845Z Collecting idna<4,>=2.5; python_version >= "3" 2022-11-23T04:12:02.7997938Z Downloading idna-3.4-py3-none-any.whl (61 kB) 2022-11-23T04:12:02.8951006Z Installing collected packages: certifi, charset-normalizer, urllib3, idna, requests 2022-11-23T04:12:02.9404729Z WARNING: The script normalizer is installed in '/home/ec2-user/.local/bin' which is not on PATH. 
2022-11-23T04:12:02.9405408Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T04:12:03.1644243Z Successfully installed certifi-2022.9.24 charset-normalizer-2.0.12 idna-3.4 requests-2.26.0 urllib3-1.26.12 2022-11-23T04:12:03.2170834Z ++ python3 .github/scripts/get_workflow_job_id.py 3528394938 i-08a957f819e89e94d 2022-11-23T04:12:05.0425638Z + GHA_WORKFLOW_JOB_ID=9655554887 2022-11-23T04:12:05.0426439Z + echo job-id=9655554887 2022-11-23T04:12:05.8309432Z Command completed after 1 attempt(s). 2022-11-23T04:12:05.8463738Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2022-11-23T04:12:05.8464532Z kill "$MONITOR_SCRIPT_PID" 2022-11-23T04:12:05.8479362Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T04:12:05.8479666Z env: 2022-11-23T04:12:05.8479916Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:05.8480163Z GPU_FLAG: --gpus all 2022-11-23T04:12:05.8480526Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:05.8480894Z MONITOR_SCRIPT_PID: 92140 2022-11-23T04:12:05.8481134Z ##[endgroup] 2022-11-23T04:12:05.8589656Z Prepare all required actions 2022-11-23T04:12:05.8590038Z Getting action download info 2022-11-23T04:12:06.0262517Z Download action repository 'actions/upload-artifact@v3' (SHA:83fd05a356d7e2593de66fc9913b3002723633cb) 2022-11-23T04:12:06.2028891Z ##[group]Run ./.github/actions/upload-test-artifacts 2022-11-23T04:12:06.2029218Z with: 2022-11-23T04:12:06.2029564Z file-suffix: test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887 2022-11-23T04:12:06.2029902Z env: 2022-11-23T04:12:06.2030126Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:06.2030393Z GPU_FLAG: --gpus all 2022-11-23T04:12:06.2030757Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:06.2031089Z ##[endgroup] 2022-11-23T04:12:06.2060705Z ##[group]Run # Remove any previous test jsons if they exist 
2022-11-23T04:12:06.2061065Z # Remove any previous test jsons if they exist 2022-11-23T04:12:06.2061381Z rm -f test-jsons-*.zip 2022-11-23T04:12:06.2061753Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test -i '*.json' 2022-11-23T04:12:06.2074987Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T04:12:06.2075291Z env: 2022-11-23T04:12:06.2075533Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:06.2075785Z GPU_FLAG: --gpus all 2022-11-23T04:12:06.2076159Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:06.2076624Z FILE_SUFFIX: test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887 2022-11-23T04:12:06.2076960Z ##[endgroup] 2022-11-23T04:12:06.2271497Z adding: test/allowlist_for_publicAPI.json (deflated 79%) 2022-11-23T04:12:06.2308998Z adding: test/benchmark_utils/callgrind_artifacts.json (deflated 92%) 2022-11-23T04:12:06.2316388Z adding: test/profiler/profiler_utils_mock_events.json (deflated 87%) 2022-11-23T04:12:06.2316910Z adding: test/.pytorch-slow-tests.json (deflated 73%) 2022-11-23T04:12:06.2328650Z adding: test/.pytorch-disabled-tests.json (deflated 86%) 2022-11-23T04:12:06.2352013Z ##[group]Run # Remove any previous test reports if they exist 2022-11-23T04:12:06.2352400Z # Remove any previous test reports if they exist 2022-11-23T04:12:06.2352725Z rm -f test-reports-*.zip 2022-11-23T04:12:06.2353087Z zip -r "test-reports-${FILE_SUFFIX}.zip" test -i '*.xml' -i '*.csv' 2022-11-23T04:12:06.2364888Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T04:12:06.2365187Z env: 2022-11-23T04:12:06.2365428Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:06.2365677Z GPU_FLAG: --gpus all 2022-11-23T04:12:06.2366046Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:06.2366501Z FILE_SUFFIX: test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887 2022-11-23T04:12:06.2366846Z ##[endgroup] 2022-11-23T04:12:06.2502929Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_common/TEST-CommTest-20221123014921.xml (deflated 37%) 2022-11-23T04:12:06.2503825Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123014927.xml (deflated 41%) 2022-11-23T04:12:06.2505074Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123014932.xml (deflated 40%) 2022-11-23T04:12:06.2505897Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123014936.xml (deflated 40%) 2022-11-23T04:12:06.2506681Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123014940.xml (deflated 42%) 2022-11-23T04:12:06.2507510Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123014944.xml (deflated 41%) 2022-11-23T04:12:06.2508369Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123014950.xml (deflated 41%) 2022-11-23T04:12:06.2509384Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123015000.xml (deflated 41%) 2022-11-23T04:12:06.2510239Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123015007.xml (deflated 41%) 2022-11-23T04:12:06.2511010Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123015016.xml (deflated 39%) 2022-11-23T04:12:06.2511707Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123015020.xml (deflated 39%) 2022-11-23T04:12:06.2512389Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123015024.xml (deflated 39%) 2022-11-23T04:12:06.2513062Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015036.xml (deflated 38%) 2022-11-23T04:12:06.2513730Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015043.xml (deflated 38%) 2022-11-23T04:12:06.2514395Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015051.xml (deflated 38%) 2022-11-23T04:12:06.2515074Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015058.xml (deflated 38%) 2022-11-23T04:12:06.2515720Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015105.xml (deflated 38%) 2022-11-23T04:12:06.2516381Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015113.xml (deflated 38%) 2022-11-23T04:12:06.2517042Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015121.xml (deflated 38%) 2022-11-23T04:12:06.2517706Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015128.xml (deflated 38%) 2022-11-23T04:12:06.2518347Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015134.xml (deflated 38%) 2022-11-23T04:12:06.2519009Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015141.xml (deflated 37%) 2022-11-23T04:12:06.2519666Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123015148.xml (deflated 37%) 2022-11-23T04:12:06.2520448Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015154.xml (deflated 38%) 2022-11-23T04:12:06.2521117Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015201.xml (deflated 38%) 2022-11-23T04:12:06.2521800Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015209.xml (deflated 38%) 2022-11-23T04:12:06.2522483Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015216.xml (deflated 38%) 2022-11-23T04:12:06.2523162Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015224.xml (deflated 38%) 2022-11-23T04:12:06.2523817Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015231.xml (deflated 38%) 2022-11-23T04:12:06.2524502Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015239.xml (deflated 39%) 2022-11-23T04:12:06.2525210Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015246.xml (deflated 38%) 2022-11-23T04:12:06.2525888Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015254.xml (deflated 38%) 2022-11-23T04:12:06.2526548Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015300.xml (deflated 38%) 2022-11-23T04:12:06.2527223Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123015307.xml (deflated 38%) 2022-11-23T04:12:06.2528026Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015315.xml (deflated 45%) 2022-11-23T04:12:06.2528898Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015324.xml (deflated 44%) 2022-11-23T04:12:06.2529708Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015333.xml (deflated 43%) 2022-11-23T04:12:06.2530482Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015342.xml (deflated 43%) 
2022-11-23T04:12:06.2531259Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015350.xml (deflated 45%) 2022-11-23T04:12:06.2532050Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015359.xml (deflated 45%) 2022-11-23T04:12:06.2532845Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015408.xml (deflated 46%) 2022-11-23T04:12:06.2533632Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015417.xml (deflated 46%) 2022-11-23T04:12:06.2534424Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015425.xml (deflated 44%) 2022-11-23T04:12:06.2535226Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015434.xml (deflated 45%) 2022-11-23T04:12:06.2536022Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015443.xml (deflated 46%) 2022-11-23T04:12:06.2536802Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015452.xml (deflated 44%) 2022-11-23T04:12:06.2537597Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015500.xml (deflated 44%) 2022-11-23T04:12:06.2538376Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015509.xml (deflated 43%) 2022-11-23T04:12:06.2539172Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015516.xml (deflated 44%) 2022-11-23T04:12:06.2540009Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015524.xml (deflated 45%) 
2022-11-23T04:12:06.2540796Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015530.xml (deflated 44%) 2022-11-23T04:12:06.2541587Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015537.xml (deflated 45%) 2022-11-23T04:12:06.2542413Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015543.xml (deflated 45%) 2022-11-23T04:12:06.2543200Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015550.xml (deflated 50%) 2022-11-23T04:12:06.2544267Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015558.xml (deflated 42%) 2022-11-23T04:12:06.2545085Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015606.xml (deflated 41%) 2022-11-23T04:12:06.2545859Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015615.xml (deflated 42%) 2022-11-23T04:12:06.2546646Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015623.xml (deflated 41%) 2022-11-23T04:12:06.2547500Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015632.xml (deflated 42%) 2022-11-23T04:12:06.2548294Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015640.xml (deflated 41%) 2022-11-23T04:12:06.2549083Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015651.xml (deflated 42%) 2022-11-23T04:12:06.2549875Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015657.xml (deflated 42%) 
2022-11-23T04:12:06.2550642Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015704.xml (deflated 41%) 2022-11-23T04:12:06.2551425Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015710.xml (deflated 44%) 2022-11-23T04:12:06.2552213Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015717.xml (deflated 45%) 2022-11-23T04:12:06.2552997Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015724.xml (deflated 41%) 2022-11-23T04:12:06.2553758Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015732.xml (deflated 41%) 2022-11-23T04:12:06.2554538Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015738.xml (deflated 41%) 2022-11-23T04:12:06.2555321Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015748.xml (deflated 41%) 2022-11-23T04:12:06.2556109Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015755.xml (deflated 42%) 2022-11-23T04:12:06.2556880Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015801.xml (deflated 41%) 2022-11-23T04:12:06.2557662Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123015811.xml (deflated 41%) 2022-11-23T04:12:06.2558553Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123015820.xml (deflated 42%) 2022-11-23T04:12:06.2559610Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123015826.xml (deflated 42%) 2022-11-23T04:12:06.2560551Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123015833.xml (deflated 43%) 2022-11-23T04:12:06.2561513Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123015840.xml (deflated 42%) 2022-11-23T04:12:06.2562368Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015846.xml (deflated 39%) 2022-11-23T04:12:06.2563117Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015853.xml (deflated 39%) 2022-11-23T04:12:06.2563881Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020447.xml (deflated 39%) 2022-11-23T04:12:06.2564590Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015901.xml (deflated 39%) 2022-11-23T04:12:06.2565340Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015908.xml (deflated 40%) 2022-11-23T04:12:06.2566083Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015915.xml (deflated 39%) 2022-11-23T04:12:06.2566832Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015921.xml (deflated 39%) 2022-11-23T04:12:06.2567614Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015928.xml (deflated 39%) 2022-11-23T04:12:06.2568369Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015935.xml (deflated 39%) 2022-11-23T04:12:06.2569113Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015945.xml (deflated 40%) 2022-11-23T04:12:06.2569851Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123015952.xml (deflated 39%) 2022-11-23T04:12:06.2570564Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020000.xml (deflated 39%) 2022-11-23T04:12:06.2571307Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020009.xml (deflated 40%) 2022-11-23T04:12:06.2572043Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020016.xml (deflated 40%) 2022-11-23T04:12:06.2572787Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020022.xml (deflated 40%) 2022-11-23T04:12:06.2573502Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020029.xml (deflated 40%) 2022-11-23T04:12:06.2574246Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020036.xml (deflated 39%) 2022-11-23T04:12:06.2574985Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020043.xml (deflated 39%) 2022-11-23T04:12:06.2575778Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020051.xml (deflated 40%) 2022-11-23T04:12:06.2576522Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020058.xml (deflated 40%) 2022-11-23T04:12:06.2577247Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020105.xml (deflated 40%) 2022-11-23T04:12:06.2577991Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020114.xml (deflated 40%) 2022-11-23T04:12:06.2578791Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020121.xml (deflated 40%) 2022-11-23T04:12:06.2579525Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020128.xml (deflated 40%) 2022-11-23T04:12:06.2580239Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020136.xml (deflated 40%) 2022-11-23T04:12:06.2580980Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020143.xml (deflated 40%) 2022-11-23T04:12:06.2581723Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020150.xml (deflated 40%) 2022-11-23T04:12:06.2582460Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020158.xml (deflated 40%) 2022-11-23T04:12:06.2583184Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020205.xml (deflated 40%) 2022-11-23T04:12:06.2584195Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020212.xml (deflated 40%) 2022-11-23T04:12:06.2584955Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020220.xml (deflated 40%) 2022-11-23T04:12:06.2585692Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020227.xml (deflated 40%) 2022-11-23T04:12:06.2586491Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020234.xml (deflated 40%) 2022-11-23T04:12:06.2587240Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020241.xml (deflated 40%) 2022-11-23T04:12:06.2587987Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020252.xml (deflated 40%) 2022-11-23T04:12:06.2588736Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020258.xml (deflated 39%) 2022-11-23T04:12:06.2589454Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020305.xml (deflated 40%) 2022-11-23T04:12:06.2590187Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020314.xml (deflated 40%) 2022-11-23T04:12:06.2590921Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020320.xml (deflated 40%) 2022-11-23T04:12:06.2591656Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020327.xml (deflated 39%) 2022-11-23T04:12:06.2592375Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020337.xml (deflated 40%) 2022-11-23T04:12:06.2593119Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020343.xml (deflated 40%) 2022-11-23T04:12:06.2593849Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020350.xml (deflated 40%) 2022-11-23T04:12:06.2594577Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020357.xml (deflated 39%) 2022-11-23T04:12:06.2595294Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020406.xml (deflated 39%) 2022-11-23T04:12:06.2596038Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020413.xml (deflated 40%) 2022-11-23T04:12:06.2596788Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020420.xml (deflated 41%) 2022-11-23T04:12:06.2597514Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020422.xml (deflated 40%) 2022-11-23T04:12:06.2598327Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020429.xml (deflated 41%) 2022-11-23T04:12:06.2599064Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020431.xml (deflated 40%) 2022-11-23T04:12:06.2599802Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123020440.xml (deflated 40%) 2022-11-23T04:12:06.2600513Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020449.xml (deflated 39%) 2022-11-23T04:12:06.2601181Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020451.xml (deflated 39%) 2022-11-23T04:12:06.2601858Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020454.xml (deflated 39%) 2022-11-23T04:12:06.2602540Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020456.xml (deflated 38%) 2022-11-23T04:12:06.2603215Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123020458.xml (deflated 39%) 2022-11-23T04:12:06.2603906Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-RendezvousEnvTest-20221123020501.xml (deflated 39%) 2022-11-23T04:12:06.2604602Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-TimeoutTest-20221123020505.xml (deflated 41%) 2022-11-23T04:12:06.2605341Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020517.xml (deflated 38%) 2022-11-23T04:12:06.2606011Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020526.xml (deflated 38%) 2022-11-23T04:12:06.2606653Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020536.xml (deflated 38%) 2022-11-23T04:12:06.2607317Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020545.xml (deflated 38%) 2022-11-23T04:12:06.2607970Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020553.xml (deflated 38%) 2022-11-23T04:12:06.2608626Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020600.xml (deflated 38%) 2022-11-23T04:12:06.2609264Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020616.xml (deflated 38%) 2022-11-23T04:12:06.2609927Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020628.xml (deflated 39%) 2022-11-23T04:12:06.2610575Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020640.xml (deflated 37%) 2022-11-23T04:12:06.2611231Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020648.xml (deflated 37%) 2022-11-23T04:12:06.2611875Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020656.xml (deflated 37%) 2022-11-23T04:12:06.2612535Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020704.xml (deflated 37%) 2022-11-23T04:12:06.2613187Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020711.xml (deflated 38%) 2022-11-23T04:12:06.2613840Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020720.xml 
(deflated 38%) 2022-11-23T04:12:06.2614480Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020728.xml (deflated 38%) 2022-11-23T04:12:06.2615131Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020735.xml (deflated 38%) 2022-11-23T04:12:06.2615781Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020743.xml (deflated 38%) 2022-11-23T04:12:06.2616486Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020751.xml (deflated 38%) 2022-11-23T04:12:06.2617136Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20221123020801.xml (deflated 38%) 2022-11-23T04:12:06.2617808Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020809.xml (deflated 38%) 2022-11-23T04:12:06.2618496Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020818.xml (deflated 38%) 2022-11-23T04:12:06.2619165Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020828.xml (deflated 38%) 2022-11-23T04:12:06.2619864Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020837.xml (deflated 38%) 2022-11-23T04:12:06.2620543Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020847.xml (deflated 38%) 2022-11-23T04:12:06.2621225Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020856.xml (deflated 38%) 2022-11-23T04:12:06.2621879Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CompilerTest-20221123020905.xml (deflated 38%) 2022-11-23T04:12:06.2622621Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020915.xml (deflated 41%) 2022-11-23T04:12:06.2623483Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020925.xml (deflated 41%) 2022-11-23T04:12:06.2624568Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020935.xml (deflated 41%) 2022-11-23T04:12:06.2625349Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020945.xml (deflated 41%) 2022-11-23T04:12:06.2626144Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123020955.xml (deflated 41%) 2022-11-23T04:12:06.2626927Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021004.xml (deflated 42%) 2022-11-23T04:12:06.2627723Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021013.xml (deflated 41%) 2022-11-23T04:12:06.2628513Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021023.xml (deflated 41%) 2022-11-23T04:12:06.2629344Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021032.xml (deflated 41%) 2022-11-23T04:12:06.2630135Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021042.xml (deflated 45%) 2022-11-23T04:12:06.2630928Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021050.xml (deflated 45%) 2022-11-23T04:12:06.2631718Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021059.xml (deflated 43%) 2022-11-23T04:12:06.2632478Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021108.xml (deflated 43%) 2022-11-23T04:12:06.2633258Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021117.xml (deflated 45%) 2022-11-23T04:12:06.2634045Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021126.xml (deflated 46%) 2022-11-23T04:12:06.2634904Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021134.xml (deflated 46%) 2022-11-23T04:12:06.2635784Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021143.xml (deflated 46%) 2022-11-23T04:12:06.2636567Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021152.xml (deflated 45%) 2022-11-23T04:12:06.2637330Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021201.xml (deflated 45%) 2022-11-23T04:12:06.2638109Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021209.xml (deflated 46%) 2022-11-23T04:12:06.2638902Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021218.xml (deflated 44%) 2022-11-23T04:12:06.2639692Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021227.xml (deflated 44%) 2022-11-23T04:12:06.2640468Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021236.xml (deflated 42%) 2022-11-23T04:12:06.2641253Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021245.xml (deflated 42%) 2022-11-23T04:12:06.2642033Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021255.xml (deflated 42%) 2022-11-23T04:12:06.2642817Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021304.xml (deflated 45%) 2022-11-23T04:12:06.2643649Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021313.xml (deflated 44%) 2022-11-23T04:12:06.2644446Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021323.xml (deflated 41%) 2022-11-23T04:12:06.2645222Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021332.xml (deflated 44%) 2022-11-23T04:12:06.2646006Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021342.xml (deflated 41%) 2022-11-23T04:12:06.2646773Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021352.xml (deflated 41%) 2022-11-23T04:12:06.2647558Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021359.xml (deflated 41%) 2022-11-23T04:12:06.2648345Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021408.xml (deflated 41%) 2022-11-23T04:12:06.2649134Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021417.xml (deflated 42%) 2022-11-23T04:12:06.2649919Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021427.xml (deflated 42%) 2022-11-23T04:12:06.2650687Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021432.xml (deflated 42%) 2022-11-23T04:12:06.2651466Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021436.xml (deflated 42%) 2022-11-23T04:12:06.2652248Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021440.xml (deflated 42%) 2022-11-23T04:12:06.2653035Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021444.xml (deflated 42%) 2022-11-23T04:12:06.2653806Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021448.xml (deflated 41%) 2022-11-23T04:12:06.2654588Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021452.xml (deflated 42%) 2022-11-23T04:12:06.2655435Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021502.xml (deflated 41%) 2022-11-23T04:12:06.2656215Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021512.xml (deflated 42%) 2022-11-23T04:12:06.2656977Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021521.xml (deflated 41%) 2022-11-23T04:12:06.2657762Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021531.xml (deflated 41%) 2022-11-23T04:12:06.2658541Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021542.xml (deflated 41%) 2022-11-23T04:12:06.2659319Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021556.xml (deflated 41%) 2022-11-23T04:12:06.2660085Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021603.xml (deflated 42%) 2022-11-23T04:12:06.2660869Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021613.xml (deflated 42%) 2022-11-23T04:12:06.2661653Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021623.xml (deflated 41%) 2022-11-23T04:12:06.2662484Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021632.xml (deflated 42%) 2022-11-23T04:12:06.2663256Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021642.xml (deflated 42%) 2022-11-23T04:12:06.2664397Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021654.xml (deflated 42%) 2022-11-23T04:12:06.2665203Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021700.xml (deflated 41%) 2022-11-23T04:12:06.2665990Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021709.xml (deflated 42%) 2022-11-23T04:12:06.2666754Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021720.xml (deflated 43%) 2022-11-23T04:12:06.2667540Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021730.xml (deflated 42%) 2022-11-23T04:12:06.2668326Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021740.xml (deflated 43%) 2022-11-23T04:12:06.2669121Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021805.xml (deflated 44%) 2022-11-23T04:12:06.2669884Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021815.xml (deflated 42%) 2022-11-23T04:12:06.2670664Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021825.xml (deflated 42%) 2022-11-23T04:12:06.2671443Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021831.xml (deflated 41%) 2022-11-23T04:12:06.2672226Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021841.xml (deflated 40%) 2022-11-23T04:12:06.2673014Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021850.xml (deflated 41%) 2022-11-23T04:12:06.2673781Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20221123021901.xml (deflated 41%) 2022-11-23T04:12:06.2674701Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123021911.xml (deflated 40%) 2022-11-23T04:12:06.2675460Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123021918.xml (deflated 41%) 2022-11-23T04:12:06.2676207Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123021936.xml (deflated 42%) 2022-11-23T04:12:06.2676939Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123021939.xml (deflated 41%) 2022-11-23T04:12:06.2677718Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022000.xml (deflated 41%) 2022-11-23T04:12:06.2678485Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022022.xml (deflated 42%) 2022-11-23T04:12:06.2679289Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022044.xml (deflated 41%) 2022-11-23T04:12:06.2680058Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022105.xml (deflated 42%) 2022-11-23T04:12:06.2680815Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20221123022107.xml (deflated 41%) 2022-11-23T04:12:06.2681710Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclProcessGroupWithDispatchedCollectivesTests-20221123022137.xml (deflated 42%) 2022-11-23T04:12:06.2682747Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclProcessGroupWithDispatchedCollectivesTests-20221123022145.xml (deflated 42%) 2022-11-23T04:12:06.2683701Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclProcessGroupWithDispatchedCollectivesTests-20221123022153.xml (deflated 44%) 2022-11-23T04:12:06.2684649Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclProcessGroupWithDispatchedCollectivesTests-20221123022201.xml (deflated 42%) 2022-11-23T04:12:06.2685530Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLNoGPUTest-20221123022209.xml (deflated 41%) 2022-11-23T04:12:06.2686294Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022211.xml (deflated 40%) 2022-11-23T04:12:06.2687019Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022220.xml (deflated 39%) 2022-11-23T04:12:06.2687828Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022229.xml (deflated 39%) 2022-11-23T04:12:06.2688564Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022238.xml (deflated 39%) 2022-11-23T04:12:06.2689302Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022248.xml (deflated 39%) 2022-11-23T04:12:06.2690031Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022257.xml (deflated 39%) 2022-11-23T04:12:06.2690767Z 
adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022306.xml (deflated 39%) 2022-11-23T04:12:06.2691500Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022316.xml (deflated 39%) 2022-11-23T04:12:06.2692230Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022324.xml (deflated 39%) 2022-11-23T04:12:06.2692950Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022333.xml (deflated 39%) 2022-11-23T04:12:06.2693686Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022348.xml (deflated 39%) 2022-11-23T04:12:06.2694485Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022356.xml (deflated 39%) 2022-11-23T04:12:06.2695214Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022405.xml (deflated 39%) 2022-11-23T04:12:06.2695933Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022413.xml (deflated 39%) 2022-11-23T04:12:06.2696668Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022423.xml (deflated 39%) 2022-11-23T04:12:06.2697403Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022432.xml (deflated 38%) 2022-11-23T04:12:06.2698134Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022440.xml (deflated 39%) 2022-11-23T04:12:06.2698856Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022449.xml (deflated 39%) 2022-11-23T04:12:06.2699598Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20221123022503.xml (deflated 39%) 2022-11-23T04:12:06.2700324Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-RendezvousEnvTest-20221123022512.xml (deflated 40%) 2022-11-23T04:12:06.2701016Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-TimeoutTest-20221123022516.xml (deflated 40%) 2022-11-23T04:12:06.2701869Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022534.xml (deflated 43%) 2022-11-23T04:12:06.2702808Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022538.xml (deflated 43%) 2022-11-23T04:12:06.2703732Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022543.xml (deflated 44%) 2022-11-23T04:12:06.2704894Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022548.xml (deflated 41%) 2022-11-23T04:12:06.2705736Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022558.xml (deflated 42%) 2022-11-23T04:12:06.2706557Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022608.xml (deflated 43%) 2022-11-23T04:12:06.2707401Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022617.xml (deflated 41%) 2022-11-23T04:12:06.2708236Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022627.xml (deflated 42%) 2022-11-23T04:12:06.2709066Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022636.xml (deflated 41%) 2022-11-23T04:12:06.2709890Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022646.xml (deflated 41%) 2022-11-23T04:12:06.2710716Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022656.xml (deflated 42%) 2022-11-23T04:12:06.2711545Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022715.xml (deflated 42%) 2022-11-23T04:12:06.2712374Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022725.xml (deflated 42%) 2022-11-23T04:12:06.2713180Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022735.xml (deflated 42%) 2022-11-23T04:12:06.2714104Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022744.xml (deflated 43%) 2022-11-23T04:12:06.2714935Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022754.xml (deflated 42%) 2022-11-23T04:12:06.2715757Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022804.xml (deflated 42%) 2022-11-23T04:12:06.2716563Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022813.xml (deflated 42%) 2022-11-23T04:12:06.2717390Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022823.xml (deflated 42%) 2022-11-23T04:12:06.2718217Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022833.xml (deflated 42%) 2022-11-23T04:12:06.2718969Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-FileStoreTest-20221123022849.xml (deflated 39%) 2022-11-23T04:12:06.2719636Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-FileStoreTest-20221123022853.xml (deflated 39%) 2022-11-23T04:12:06.2720323Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-FileStoreTest-20221123022857.xml (deflated 39%) 2022-11-23T04:12:06.2721006Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-FileStoreTest-20221123022901.xml (deflated 39%) 2022-11-23T04:12:06.2721740Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-HashStoreTest-20221123022905.xml (deflated 39%) 2022-11-23T04:12:06.2722417Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-HashStoreTest-20221123022909.xml (deflated 40%) 2022-11-23T04:12:06.2723128Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-PrefixFileStoreTest-20221123022914.xml (deflated 40%) 2022-11-23T04:12:06.2723853Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-PrefixFileStoreTest-20221123022918.xml (deflated 40%) 2022-11-23T04:12:06.2724565Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-PrefixStoreTest-20221123022922.xml (deflated 40%) 2022-11-23T04:12:06.2725261Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-PrefixTCPStoreTest-20221123022924.xml (deflated 40%) 2022-11-23T04:12:06.2725978Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-PrefixTCPStoreTest-20221123022928.xml (deflated 39%) 2022-11-23T04:12:06.2726685Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-PythonStoreTest-20221123022932.xml (deflated 39%) 2022-11-23T04:12:06.2727390Z adding: 
test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousEnvTest-20221123022936.xml (deflated 39%) 2022-11-23T04:12:06.2728090Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousFileTest-20221123022940.xml (deflated 40%) 2022-11-23T04:12:06.2728862Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousFileTest-20221123022945.xml (deflated 39%) 2022-11-23T04:12:06.2729577Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousTCPTest-20221123022949.xml (deflated 39%) 2022-11-23T04:12:06.2730277Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousTCPTest-20221123022953.xml (deflated 39%) 2022-11-23T04:12:06.2730963Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousTCPTest-20221123022957.xml (deflated 39%) 2022-11-23T04:12:06.2731677Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousTCPTest-20221123023001.xml (deflated 40%) 2022-11-23T04:12:06.2732379Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousTest-20221123023015.xml (deflated 38%) 2022-11-23T04:12:06.2733134Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-RendezvousTest-20221123023019.xml (deflated 39%) 2022-11-23T04:12:06.2733803Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023024.xml (deflated 39%) 2022-11-23T04:12:06.2734480Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023028.xml (deflated 39%) 2022-11-23T04:12:06.2735155Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023032.xml (deflated 38%) 2022-11-23T04:12:06.2735826Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023036.xml (deflated 38%) 2022-11-23T04:12:06.2736488Z adding: 
test/test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023040.xml (deflated 38%) 2022-11-23T04:12:06.2737156Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023044.xml (deflated 39%) 2022-11-23T04:12:06.2737837Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023048.xml (deflated 38%) 2022-11-23T04:12:06.2738503Z adding: test/test-reports/python-unittest/distributed.test_store/TEST-TCPStoreTest-20221123023054.xml (deflated 38%) 2022-11-23T04:12:06.2739237Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023106.xml (deflated 41%) 2022-11-23T04:12:06.2740051Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023113.xml (deflated 41%) 2022-11-23T04:12:06.2740922Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023120.xml (deflated 40%) 2022-11-23T04:12:06.2741746Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023128.xml (deflated 40%) 2022-11-23T04:12:06.2742536Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023137.xml (deflated 40%) 2022-11-23T04:12:06.2743343Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023144.xml (deflated 41%) 2022-11-23T04:12:06.2744416Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023151.xml (deflated 40%) 2022-11-23T04:12:06.2745311Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023159.xml (deflated 40%) 2022-11-23T04:12:06.2746135Z adding: 
test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123023208.xml (deflated 40%) 2022-11-23T04:12:06.2746941Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023215.xml (deflated 40%) 2022-11-23T04:12:06.2747724Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023222.xml (deflated 40%) 2022-11-23T04:12:06.2748532Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023230.xml (deflated 39%) 2022-11-23T04:12:06.2749342Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023238.xml (deflated 40%) 2022-11-23T04:12:06.2750149Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123023248.xml (deflated 39%) 2022-11-23T04:12:06.2750998Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDdpComparisonTest-20221123023306.xml (deflated 41%) 2022-11-23T04:12:06.2751969Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20221123023316.xml (deflated 41%) 2022-11-23T04:12:06.2754195Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20221123023326.xml (deflated 41%) 2022-11-23T04:12:06.2755478Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20221123023336.xml (deflated 41%) 2022-11-23T04:12:06.2756371Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20221123023345.xml (deflated 41%) 2022-11-23T04:12:06.2757225Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20221123023354.xml (deflated 41%) 2022-11-23T04:12:06.2758117Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20221123023404.xml (deflated 41%) 2022-11-23T04:12:06.2758998Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20221123023411.xml (deflated 41%) 2022-11-23T04:12:06.2759851Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRpcTest-20221123023420.xml (deflated 40%) 2022-11-23T04:12:06.2760681Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023432.xml (deflated 40%) 2022-11-23T04:12:06.2761537Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023444.xml (deflated 40%) 2022-11-23T04:12:06.2762482Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023455.xml (deflated 40%) 2022-11-23T04:12:06.2763365Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023506.xml (deflated 39%) 2022-11-23T04:12:06.2764258Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023518.xml (deflated 40%) 2022-11-23T04:12:06.2765137Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023530.xml (deflated 40%) 2022-11-23T04:12:06.2766004Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023543.xml (deflated 40%) 2022-11-23T04:12:06.2766863Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20221123023556.xml (deflated 39%) 2022-11-23T04:12:06.2767785Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023609.xml (deflated 42%) 2022-11-23T04:12:06.2768727Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023623.xml (deflated 42%) 2022-11-23T04:12:06.2769691Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023638.xml (deflated 42%) 2022-11-23T04:12:06.2770655Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023653.xml (deflated 43%) 2022-11-23T04:12:06.2771685Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023706.xml (deflated 44%) 2022-11-23T04:12:06.2772612Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023719.xml (deflated 43%) 2022-11-23T04:12:06.2773563Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023731.xml (deflated 44%) 2022-11-23T04:12:06.2774516Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023744.xml (deflated 43%) 2022-11-23T04:12:06.2775537Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023757.xml (deflated 43%) 2022-11-23T04:12:06.2776467Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023810.xml (deflated 43%) 2022-11-23T04:12:06.2777406Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023818.xml (deflated 43%) 2022-11-23T04:12:06.2778356Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023825.xml (deflated 43%) 2022-11-23T04:12:06.2779297Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023833.xml (deflated 43%) 2022-11-23T04:12:06.2780234Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023840.xml (deflated 43%) 2022-11-23T04:12:06.2781156Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023849.xml (deflated 43%) 2022-11-23T04:12:06.2782100Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023858.xml (deflated 42%) 2022-11-23T04:12:06.2783106Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023913.xml (deflated 43%) 2022-11-23T04:12:06.2784358Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023917.xml (deflated 42%) 2022-11-23T04:12:06.2785303Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023940.xml (deflated 43%) 2022-11-23T04:12:06.2786254Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123023956.xml (deflated 42%) 2022-11-23T04:12:06.2787202Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024011.xml (deflated 43%) 2022-11-23T04:12:06.2788158Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024019.xml (deflated 42%) 2022-11-23T04:12:06.2789105Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024030.xml (deflated 42%) 2022-11-23T04:12:06.2790034Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024041.xml (deflated 42%) 2022-11-23T04:12:06.2790988Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024052.xml (deflated 43%) 2022-11-23T04:12:06.2791939Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024057.xml (deflated 43%) 2022-11-23T04:12:06.2792882Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024110.xml (deflated 43%) 2022-11-23T04:12:06.2793813Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024124.xml (deflated 42%) 2022-11-23T04:12:06.2794764Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024137.xml (deflated 42%) 2022-11-23T04:12:06.2795805Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024150.xml (deflated 43%) 2022-11-23T04:12:06.2796752Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024204.xml (deflated 42%) 2022-11-23T04:12:06.2797801Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024217.xml (deflated 42%) 2022-11-23T04:12:06.2798737Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024231.xml (deflated 43%) 2022-11-23T04:12:06.2799685Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024245.xml (deflated 42%) 2022-11-23T04:12:06.2800641Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024258.xml (deflated 42%) 2022-11-23T04:12:06.2801583Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024312.xml (deflated 42%) 2022-11-23T04:12:06.2802510Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024325.xml (deflated 42%) 2022-11-23T04:12:06.2803521Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024338.xml (deflated 42%) 2022-11-23T04:12:06.2804477Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024352.xml (deflated 42%) 2022-11-23T04:12:06.2805429Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024405.xml (deflated 42%) 2022-11-23T04:12:06.2806379Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024419.xml (deflated 42%) 2022-11-23T04:12:06.2807305Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024432.xml (deflated 42%) 2022-11-23T04:12:06.2808247Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024443.xml (deflated 43%) 2022-11-23T04:12:06.2809202Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024456.xml (deflated 42%) 2022-11-23T04:12:06.2810143Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024507.xml (deflated 42%) 2022-11-23T04:12:06.2811079Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024518.xml (deflated 43%) 2022-11-23T04:12:06.2812021Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024532.xml (deflated 43%) 2022-11-23T04:12:06.2812956Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024545.xml (deflated 43%) 2022-11-23T04:12:06.2813893Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024553.xml (deflated 43%) 2022-11-23T04:12:06.2814818Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024600.xml (deflated 43%) 2022-11-23T04:12:06.2815821Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024608.xml (deflated 42%) 2022-11-23T04:12:06.2816759Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024616.xml (deflated 42%) 2022-11-23T04:12:06.2817701Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024625.xml (deflated 42%) 2022-11-23T04:12:06.2818637Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024635.xml (deflated 42%) 2022-11-23T04:12:06.2819560Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024645.xml (deflated 42%) 2022-11-23T04:12:06.2820493Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024654.xml (deflated 42%) 2022-11-23T04:12:06.2821432Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024704.xml (deflated 42%) 2022-11-23T04:12:06.2822365Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024714.xml (deflated 42%) 2022-11-23T04:12:06.2823284Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024724.xml (deflated 42%) 2022-11-23T04:12:06.2824568Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024737.xml (deflated 42%) 2022-11-23T04:12:06.2825566Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024751.xml (deflated 43%) 2022-11-23T04:12:06.2826512Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024758.xml (deflated 43%) 2022-11-23T04:12:06.2827450Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024811.xml (deflated 42%) 2022-11-23T04:12:06.2828370Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024830.xml (deflated 42%) 2022-11-23T04:12:06.2829361Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024849.xml (deflated 42%) 2022-11-23T04:12:06.2830302Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024856.xml (deflated 43%) 2022-11-23T04:12:06.2831232Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024907.xml (deflated 42%) 2022-11-23T04:12:06.2832143Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024915.xml (deflated 42%) 2022-11-23T04:12:06.2833134Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024922.xml (deflated 42%) 2022-11-23T04:12:06.2834471Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024936.xml (deflated 42%) 2022-11-23T04:12:06.2836182Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123024952.xml (deflated 42%) 2022-11-23T04:12:06.2837712Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025008.xml (deflated 42%) 2022-11-23T04:12:06.2838796Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025021.xml (deflated 42%) 2022-11-23T04:12:06.2839976Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025042.xml (deflated 42%) 2022-11-23T04:12:06.2841815Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025105.xml (deflated 43%) 2022-11-23T04:12:06.2843451Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025110.xml (deflated 42%) 2022-11-23T04:12:06.2845321Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025133.xml (deflated 42%) 2022-11-23T04:12:06.2847007Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025154.xml (deflated 42%) 2022-11-23T04:12:06.2848886Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025214.xml (deflated 42%) 2022-11-23T04:12:06.2850111Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025234.xml (deflated 42%) 2022-11-23T04:12:06.2851171Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025253.xml (deflated 42%) 2022-11-23T04:12:06.2852127Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025313.xml (deflated 42%) 2022-11-23T04:12:06.2853062Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025333.xml (deflated 42%) 2022-11-23T04:12:06.2854012Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025357.xml (deflated 42%) 2022-11-23T04:12:06.2854957Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025418.xml (deflated 42%) 2022-11-23T04:12:06.2855909Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025441.xml (deflated 42%) 2022-11-23T04:12:06.2856834Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20221123025452.xml (deflated 43%) 2022-11-23T04:12:06.2857801Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20221123025505.xml (deflated 43%) 2022-11-23T04:12:06.2858800Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20221123025516.xml (deflated 43%) 2022-11-23T04:12:06.2859786Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20221123025528.xml (deflated 43%) 2022-11-23T04:12:06.2860682Z adding: test/test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper/TEST-CheckpointWrapperTest-20221123025546.xml (deflated 68%) 2022-11-23T04:12:06.2861511Z adding: test/test-reports/python-unittest/distributed.fsdp.test_distributed_checkpoint/TEST-TestDistributedCheckpoint-20221123025557.xml (deflated 59%) 2022-11-23T04:12:06.2862284Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_apply/TEST-TestApply-20221123025614.xml (deflated 60%) 2022-11-23T04:12:06.2863105Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_checkpoint/TEST-TestFSDPCheckpoint-20221123025637.xml (deflated 90%) 2022-11-23T04:12:06.2864180Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm/TEST-TestClipGradNorm-20221123025806.xml (deflated 55%) 2022-11-23T04:12:06.2865147Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_comm/TEST-TestCommunication-20221123025842.xml (deflated 91%) 2022-11-23T04:12:06.2865933Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks/TEST-TestCommunicationHooks-20221123025933.xml (deflated 91%) 2022-11-23T04:12:06.2866677Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20221123030151.xml (deflated 79%) 2022-11-23T04:12:06.2867377Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20221123030151.xml (deflated 64%) 2022-11-23T04:12:06.2868078Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20221123030151.xml (deflated 61%) 2022-11-23T04:12:06.2868818Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20221123030151.xml (deflated 91%) 2022-11-23T04:12:06.2869580Z adding: 
test/test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order/TEST-TestFSDPExecOrder-20221123031225.xml (deflated 83%) 2022-11-23T04:12:06.2870349Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_flatten_params/TEST-TestFlattenParams-20221123031314.xml (deflated 77%) 2022-11-23T04:12:06.2871217Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights/TEST-TestFreezingWeights-20221123031401.xml (deflated 84%) 2022-11-23T04:12:06.2872002Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_fx/TEST-TestSymbolicTracing-20221123031458.xml (deflated 45%) 2022-11-23T04:12:06.2872741Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_grad_acc/TEST-TestGradAcc-20221123031512.xml (deflated 93%) 2022-11-23T04:12:06.2873528Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules/TEST-TestFSDPIgnoredModules-20221123031621.xml (deflated 75%) 2022-11-23T04:12:06.2874260Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_input/TEST-TestInput-20221123031655.xml (deflated 57%) 2022-11-23T04:12:06.2874981Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_memory/TEST-TestFSDPMemory-20221123031713.xml (deflated 55%) 2022-11-23T04:12:06.2875749Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_meta/TEST-TestFSDPWithMetaDevice-20221123031739.xml (deflated 86%) 2022-11-23T04:12:06.2876497Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_misc/TEST-TestFSDPMisc-20221123031820.xml (deflated 77%) 2022-11-23T04:12:06.2877283Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision/TEST-TestFSDPMixedPrecisionSharded-20221123031940.xml (deflated 92%) 2022-11-23T04:12:06.2878172Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision/TEST-TestFSDPMixedPrecisionUnsharded-20221123031940.xml (deflated 63%) 2022-11-23T04:12:06.2879009Z 
adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward/TEST-TestMultiForward-20221123032516.xml (deflated 41%) 2022-11-23T04:12:06.2879815Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_wrapping/TEST-TestMultipleWrapping-20221123032530.xml (deflated 47%) 2022-11-23T04:12:06.2880614Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state/TEST-TestFSDPOptimState-20221123032545.xml (deflated 93%) 2022-11-23T04:12:06.2881422Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap/TEST-TestForwardOverlapWorldSizeOne-20221123033028.xml (deflated 43%) 2022-11-23T04:12:06.2882281Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap/TEST-TestForwardOverlapWorldSizeTwo-20221123033028.xml (deflated 43%) 2022-11-23T04:12:06.2883338Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20221123033051.xml (deflated 56%) 2022-11-23T04:12:06.2884165Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_sharded_grad_scaler/TEST-TestShardGradScaler-20221123033106.xml (deflated 64%) 2022-11-23T04:12:06.2885085Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_sharded_grad_scaler/TEST-TestShardedGradScalerParityWithDDP-20221123033106.xml (deflated 83%) 2022-11-23T04:12:06.2885989Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict/TEST-TestFSDPStateDict-20221123033156.xml (deflated 95%) 2022-11-23T04:12:06.2886842Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParams-20221123034015.xml (deflated 91%) 2022-11-23T04:12:06.2887749Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParamsNoShard-20221123034015.xml (deflated 43%) 2022-11-23T04:12:06.2888624Z adding: 
test/test-reports/python-unittest/distributed.fsdp.test_fsdp_tp_integration/TEST-TestTPFSDPIntegration-20221123034406.xml (deflated 79%) 2022-11-23T04:12:06.2889447Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_traversal/TEST-TestTraversal-20221123034443.xml (deflated 41%) 2022-11-23T04:12:06.2890259Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven/TEST-TestUnevenParamShard-20221123034456.xml (deflated 41%) 2022-11-23T04:12:06.2891192Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsFQNs-20221123034511.xml (deflated 54%) 2022-11-23T04:12:06.2892161Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsMultipleParamGroups-20221123034511.xml (deflated 83%) 2022-11-23T04:12:06.2893161Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsParamAccess-20221123034511.xml (deflated 45%) 2022-11-23T04:12:06.2894134Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsUnshardReshard-20221123034511.xml (deflated 76%) 2022-11-23T04:12:06.2895098Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_use_orig_params/TEST-TestFSDPUseOrigParamsWriteback-20221123034511.xml (deflated 64%) 2022-11-23T04:12:06.2895986Z adding: test/test-reports/python-unittest/distributed.fsdp.test_utils/TEST-TestGetSubmoduleToStates-20221123034835.xml (deflated 43%) 2022-11-23T04:12:06.2896749Z adding: test/test-reports/python-unittest/distributed.fsdp.test_utils/TEST-TestUtils-20221123034835.xml (deflated 69%) 2022-11-23T04:12:06.2897492Z adding: test/test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestAutoWrap-20221123034844.xml (deflated 81%) 2022-11-23T04:12:06.2898248Z adding: test/test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestFSDPWrap-20221123034844.xml 
(deflated 89%) 2022-11-23T04:12:06.2899097Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_checkpoint/TEST-TestDistributedCheckpointing-20221123035042.xml (deflated 55%) 2022-11-23T04:12:06.2899978Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_checkpoint/TEST-TestDistributedFailure-20221123035042.xml (deflated 75%) 2022-11-23T04:12:06.2900909Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint/TEST-TestDistributedReshardOnLoad-20221123035123.xml (deflated 67%) 2022-11-23T04:12:06.2901888Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoad-20221123035123.xml (deflated 43%) 2022-11-23T04:12:06.2902960Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20221123035123.xml (deflated 44%) 2022-11-23T04:12:06.2904279Z adding: test/test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec/TEST-TestCustomShardingSpec-20221123035155.xml (deflated 65%) 2022-11-23T04:12:06.2905180Z adding: test/test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec/TEST-TestShardingSpec-20221123035155.xml (deflated 79%) 2022-11-23T04:12:06.2905982Z adding: test/test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan/TEST-TestShardingPlan-20221123035235.xml (deflated 71%) 2022-11-23T04:12:06.2906866Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype/TEST-TestShardedTensorMegatronLinear-20221123035307.xml (deflated 44%) 2022-11-23T04:12:06.2907768Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestCreateTensorFromParams-20221123035323.xml (deflated 43%) 2022-11-23T04:12:06.2908609Z adding: 
test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestLocalTensor-20221123035323.xml (deflated 57%) 2022-11-23T04:12:06.2909408Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestModuleHookApi-20221123035323.xml (deflated 54%) 2022-11-23T04:12:06.2910212Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardMetadata-20221123035323.xml (deflated 54%) 2022-11-23T04:12:06.2911091Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardParameter-20221123035323.xml (deflated 58%) 2022-11-23T04:12:06.2911883Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardTensor-20221123035323.xml (deflated 57%) 2022-11-23T04:12:06.2912715Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorChunked-20221123035323.xml (deflated 84%) 2022-11-23T04:12:06.2913586Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorCustomOps-20221123035323.xml (deflated 65%) 2022-11-23T04:12:06.2914465Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorEnumerable-20221123035323.xml (deflated 83%) 2022-11-23T04:12:06.2915357Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalShards-20221123035323.xml (deflated 78%) 2022-11-23T04:12:06.2916281Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalTensor-20221123035323.xml (deflated 58%) 2022-11-23T04:12:06.2917161Z adding: 
test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorMetadata-20221123035323.xml (deflated 44%) 2022-11-23T04:12:06.2917994Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard/TEST-TestReshard-20221123040520.xml (deflated 59%) 2022-11-23T04:12:06.2918827Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_chunk/TEST-TestShardedTensorChunkOps-20221123040539.xml (deflated 56%) 2022-11-23T04:12:06.2919717Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_elementwise_ops/TEST-TestShardedTensorElementWiseOps-20221123040558.xml (deflated 71%) 2022-11-23T04:12:06.2920600Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding/TEST-TestShardedEmbedding-20221123040625.xml (deflated 57%) 2022-11-23T04:12:06.2921453Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag/TEST-TestShardedEmbeddingBag-20221123040645.xml (deflated 58%) 2022-11-23T04:12:06.2922406Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp/TEST-TestShardedTensorBinaryOps-20221123040705.xml (deflated 70%) 2022-11-23T04:12:06.2923246Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init/TEST-TestShardedTensorNNInit-20221123040810.xml (deflated 67%) 2022-11-23T04:12:06.2924101Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_linear/TEST-TestShardedTensorOpsLinear-20221123040901.xml (deflated 65%) 2022-11-23T04:12:06.2924981Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops/TEST-TestShardedTensorMatrixOps-20221123040934.xml (deflated 83%) 2022-11-23T04:12:06.2925823Z adding: 
test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax/TEST-TestShardedSoftmax-20221123041032.xml (deflated 56%) 2022-11-23T04:12:06.2926621Z adding: test/test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim/TEST-TestShardedOptimizer-20221123041050.xml (deflated 51%) 2022-11-23T04:12:06.2927450Z adding: test/test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorOps-20221123041104.xml (deflated 63%) 2022-11-23T04:12:06.2928248Z adding: test/test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorReshard-20221123041104.xml (deflated 58%) 2022-11-23T04:12:06.2929061Z adding: test/test-reports/python-unittest/test_cuda_primary_ctx/TEST-TestCudaPrimaryCtx-20221123041146.xml (deflated 40%) 2022-11-23T04:12:06.2929824Z adding: test/test-reports/python-unittest/test_cuda_primary_ctx/TEST-TestCudaPrimaryCtx-20221123041151.xml (deflated 40%) 2022-11-23T04:12:06.2930512Z adding: test/test-reports/python-unittest/test_cuda_primary_ctx/TEST-TestCudaPrimaryCtx-20221123041156.xml (deflated 40%) 2022-11-23T04:12:06.2974096Z ##[group]Run # Remove any previous test reports if they exist 2022-11-23T04:12:06.2974495Z # Remove any previous test reports if they exist 2022-11-23T04:12:06.2974811Z rm -f usage-log-*.zip 2022-11-23T04:12:06.2975184Z # this workflow is also run in bazel build test, but we dont generate usage reports for it 2022-11-23T04:12:06.2975574Z # so check to see if the file exists first 2022-11-23T04:12:06.2975865Z if [ -f 'usage_log.txt' ]; then 2022-11-23T04:12:06.2976198Z  zip "usage-log-${FILE_SUFFIX}.zip" 'usage_log.txt' 2022-11-23T04:12:06.2976492Z fi 2022-11-23T04:12:06.2988424Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T04:12:06.2988723Z env: 2022-11-23T04:12:06.2988969Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:06.2989218Z GPU_FLAG: --gpus all 2022-11-23T04:12:06.2989580Z DOCKER_CONTAINER_ID: 
4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:06.2990041Z FILE_SUFFIX: test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887 2022-11-23T04:12:06.2990373Z ##[endgroup] 2022-11-23T04:12:06.4392210Z adding: usage_log.txt (deflated 95%) 2022-11-23T04:12:06.4440204Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T04:12:06.4440499Z with: 2022-11-23T04:12:06.4440765Z s3-prefix: pytorch/pytorch/3528394938/1/artifact 2022-11-23T04:12:06.4441073Z retention-days: 14 2022-11-23T04:12:06.4441351Z if-no-files-found: warn 2022-11-23T04:12:06.4441608Z path: test-jsons-*.zip 2022-11-23T04:12:06.4441871Z name: artifact 2022-11-23T04:12:06.4442129Z s3-bucket: gha-artifacts 2022-11-23T04:12:06.4442375Z region: us-east-1 2022-11-23T04:12:06.4442614Z env: 2022-11-23T04:12:06.4442861Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:06.4443123Z GPU_FLAG: --gpus all 2022-11-23T04:12:06.4443494Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:06.4443850Z ##[endgroup] 2022-11-23T04:12:06.9091888Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T04:12:06.9092370Z With the provided path, there will be 1 file uploaded 2022-11-23T04:12:06.9093071Z Uploading to s3 prefix: pytorch/pytorch/3528394938/1/artifact 2022-11-23T04:12:06.9104290Z Starting upload of test-jsons-test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887.zip 2022-11-23T04:12:07.0735600Z Finished upload of test-jsons-test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887.zip 2022-11-23T04:12:07.0902882Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T04:12:07.0903173Z with: 2022-11-23T04:12:07.0903433Z s3-prefix: pytorch/pytorch/3528394938/1/artifact 2022-11-23T04:12:07.0903726Z retention-days: 14 2022-11-23T04:12:07.0904392Z if-no-files-found: error 2022-11-23T04:12:07.0904673Z path: test-reports-*.zip 2022-11-23T04:12:07.0904923Z name: artifact 2022-11-23T04:12:07.0905168Z s3-bucket: gha-artifacts 
2022-11-23T04:12:07.0905408Z region: us-east-1 2022-11-23T04:12:07.0905635Z env: 2022-11-23T04:12:07.0905868Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:07.0906112Z GPU_FLAG: --gpus all 2022-11-23T04:12:07.0906485Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:07.0906834Z ##[endgroup] 2022-11-23T04:12:07.5471325Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T04:12:07.5471701Z With the provided path, there will be 1 file uploaded 2022-11-23T04:12:07.5472067Z Uploading to s3 prefix: pytorch/pytorch/3528394938/1/artifact 2022-11-23T04:12:07.5483563Z Starting upload of test-reports-test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887.zip 2022-11-23T04:12:07.7454118Z Finished upload of test-reports-test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887.zip 2022-11-23T04:12:07.7615269Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T04:12:07.7615566Z with: 2022-11-23T04:12:07.7615831Z s3-prefix: pytorch/pytorch/3528394938/1/artifact 2022-11-23T04:12:07.7616126Z retention-days: 14 2022-11-23T04:12:07.7616396Z if-no-files-found: ignore 2022-11-23T04:12:07.7616649Z path: usage-log-*.zip 2022-11-23T04:12:07.7616908Z name: artifact 2022-11-23T04:12:07.7617156Z s3-bucket: gha-artifacts 2022-11-23T04:12:07.7617395Z region: us-east-1 2022-11-23T04:12:07.7617619Z env: 2022-11-23T04:12:07.7617855Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:07.7618094Z GPU_FLAG: --gpus all 2022-11-23T04:12:07.7618453Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:07.7618799Z ##[endgroup] 2022-11-23T04:12:08.2159839Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T04:12:08.2160255Z With the provided path, there will be 1 file uploaded 2022-11-23T04:12:08.2160633Z Uploading to s3 prefix: pytorch/pytorch/3528394938/1/artifact 2022-11-23T04:12:08.2171925Z Starting upload of usage-log-test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887.zip 
2022-11-23T04:12:08.5013911Z Finished upload of usage-log-test-multigpu-1-1-linux.16xlarge.nvidia.gpu_9655554887.zip 2022-11-23T04:12:08.5174582Z ##[group]Run # shellcheck disable=SC2156 2022-11-23T04:12:08.5174944Z # shellcheck disable=SC2156 2022-11-23T04:12:08.5175353Z find . -iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2022-11-23T04:12:08.5189733Z shell: /usr/bin/bash -e {0} 2022-11-23T04:12:08.5189986Z env: 2022-11-23T04:12:08.5190224Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:08.5190639Z GPU_FLAG: --gpus all 2022-11-23T04:12:08.5190993Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:08.5191326Z ##[endgroup] 2022-11-23T04:12:08.8525154Z ##[group]Run set -x 2022-11-23T04:12:08.8525450Z set -x 2022-11-23T04:12:08.8525733Z python3 -m pip install -r requirements.txt 2022-11-23T04:12:08.8526050Z python3 -m pip install boto3==1.19.12 2022-11-23T04:12:08.8526590Z python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-11-23T04:12:08.8541714Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T04:12:08.8542310Z env: 2022-11-23T04:12:08.8542533Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:08.8542773Z GPU_FLAG: --gpus all 2022-11-23T04:12:08.8543114Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:08.8543463Z AWS_DEFAULT_REGION: us-east-1 2022-11-23T04:12:08.8543694Z BRANCH: master 2022-11-23T04:12:08.8544280Z TEST_CONFIG: multigpu 2022-11-23T04:12:08.8544620Z SHARD_NUMBER: 1 2022-11-23T04:12:08.8544824Z BUILD_ENVIRONMENT: linux-bionic-cuda11.6-py3.9-gcc7 2022-11-23T04:12:08.8545233Z PR_NUMBER: 2022-11-23T04:12:08.8545454Z PYTORCH_RETRY_TEST_CASES: 1 2022-11-23T04:12:08.8545672Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-11-23T04:12:08.8545990Z SHA1: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T04:12:08.8546196Z TAG: 2022-11-23T04:12:08.8546400Z 
WORKFLOW_ID: 3528394938 2022-11-23T04:12:08.8546828Z GITHUB_TOKEN: *** 2022-11-23T04:12:08.8547074Z GHA_WORKFLOW_JOB_ID: 9655554887 2022-11-23T04:12:08.8547303Z ##[endgroup] 2022-11-23T04:12:08.8579659Z + python3 -m pip install -r requirements.txt 2022-11-23T04:12:09.1566950Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T04:12:09.2479535Z Collecting astunparse 2022-11-23T04:12:09.2697679Z Downloading astunparse-1.6.3-py2.py3-none-any.whl (12 kB) 2022-11-23T04:12:09.3066522Z Collecting expecttest 2022-11-23T04:12:09.3125240Z Downloading expecttest-0.1.4-py3-none-any.whl (6.5 kB) 2022-11-23T04:12:09.3560228Z Collecting future 2022-11-23T04:12:09.3605658Z Downloading future-0.18.2.tar.gz (829 kB) 2022-11-23T04:12:11.4234280Z Collecting hypothesis 2022-11-23T04:12:11.4359113Z Downloading hypothesis-6.58.0-py3-none-any.whl (396 kB) 2022-11-23T04:12:12.2956331Z Collecting numpy 2022-11-23T04:12:12.3004900Z Downloading numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB) 2022-11-23T04:12:12.6535733Z Requirement already satisfied: psutil in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 7)) (5.9.1) 2022-11-23T04:12:12.7799716Z Collecting pyyaml 2022-11-23T04:12:12.7845355Z Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB) 2022-11-23T04:12:12.8059210Z Requirement already satisfied: requests in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 9)) (2.26.0) 2022-11-23T04:12:12.8246785Z Requirement already satisfied: setuptools in /usr/lib/python3.7/site-packages (from -r requirements.txt (line 10)) (49.1.3) 2022-11-23T04:12:12.8878440Z Collecting six 2022-11-23T04:12:12.9102918Z Downloading six-1.16.0-py2.py3-none-any.whl (11 kB) 2022-11-23T04:12:12.9483036Z Collecting types-dataclasses 2022-11-23T04:12:12.9528383Z Downloading 
types_dataclasses-0.6.6-py3-none-any.whl (2.9 kB) 2022-11-23T04:12:12.9984273Z Collecting typing_extensions 2022-11-23T04:12:13.0024146Z Downloading typing_extensions-4.4.0-py3-none-any.whl (26 kB) 2022-11-23T04:12:13.0614023Z Collecting sympy 2022-11-23T04:12:13.0680293Z Downloading sympy-1.10.1-py3-none-any.whl (6.4 MB) 2022-11-23T04:12:13.2783123Z Collecting filelock 2022-11-23T04:12:13.2822037Z Downloading filelock-3.8.0-py3-none-any.whl (10 kB) 2022-11-23T04:12:13.3801917Z Collecting networkx 2022-11-23T04:12:13.3937239Z Downloading networkx-2.6.3-py3-none-any.whl (1.9 MB) 2022-11-23T04:12:13.5380526Z Collecting jinja2 2022-11-23T04:12:13.5421159Z Downloading Jinja2-3.1.2-py3-none-any.whl (133 kB) 2022-11-23T04:12:13.6442871Z Collecting wheel<1.0,>=0.23.0 2022-11-23T04:12:13.6483564Z Downloading wheel-0.38.4-py3-none-any.whl (36 kB) 2022-11-23T04:12:13.7022013Z Collecting attrs>=19.2.0 2022-11-23T04:12:13.7063537Z Downloading attrs-22.1.0-py2.py3-none-any.whl (58 kB) 2022-11-23T04:12:13.7861575Z Collecting exceptiongroup>=1.0.0; python_version < "3.11" 2022-11-23T04:12:13.7900569Z Downloading exceptiongroup-1.0.4-py3-none-any.whl (14 kB) 2022-11-23T04:12:13.8457074Z Collecting sortedcontainers<3.0.0,>=2.1.0 2022-11-23T04:12:13.8502659Z Downloading sortedcontainers-2.4.0-py2.py3-none-any.whl (29 kB) 2022-11-23T04:12:13.8610087Z Requirement already satisfied: idna<4,>=2.5; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (3.4) 2022-11-23T04:12:13.8624939Z Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (2022.9.24) 2022-11-23T04:12:13.8636961Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (1.26.12) 2022-11-23T04:12:13.8871217Z Requirement already satisfied: charset-normalizer~=2.0.0; 
python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (2.0.12) 2022-11-23T04:12:13.9180130Z Collecting mpmath>=0.19 2022-11-23T04:12:13.9270512Z Downloading mpmath-1.2.1-py3-none-any.whl (532 kB) 2022-11-23T04:12:14.1012229Z Collecting MarkupSafe>=2.0 2022-11-23T04:12:14.1053876Z Downloading MarkupSafe-2.1.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB) 2022-11-23T04:12:14.1157651Z Using legacy 'setup.py install' for future, since package 'wheel' is not installed. 2022-11-23T04:12:14.3202866Z Installing collected packages: six, wheel, astunparse, expecttest, future, attrs, exceptiongroup, sortedcontainers, hypothesis, numpy, pyyaml, types-dataclasses, typing-extensions, mpmath, sympy, filelock, networkx, MarkupSafe, jinja2 2022-11-23T04:12:14.3639567Z WARNING: The script wheel is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-11-23T04:12:14.3640561Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T04:12:14.3986130Z Running setup.py install for future: started 2022-11-23T04:12:15.1094848Z Running setup.py install for future: finished with status 'done' 2022-11-23T04:12:15.4327770Z WARNING: The script hypothesis is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-11-23T04:12:15.4328458Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T04:12:17.5014187Z WARNING: The scripts f2py, f2py3 and f2py3.7 are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-11-23T04:12:17.5014942Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T04:12:26.6604326Z WARNING: The script isympy is installed in '/home/ec2-user/.local/bin' which is not on PATH. 
2022-11-23T04:12:26.6605005Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T04:12:27.9268623Z Successfully installed MarkupSafe-2.1.1 astunparse-1.6.3 attrs-22.1.0 exceptiongroup-1.0.4 expecttest-0.1.4 filelock-3.8.0 future-0.18.2 hypothesis-6.58.0 jinja2-3.1.2 mpmath-1.2.1 networkx-2.6.3 numpy-1.21.6 pyyaml-6.0 six-1.16.0 sortedcontainers-2.4.0 sympy-1.10.1 types-dataclasses-0.6.6 typing-extensions-4.4.0 wheel-0.38.4 2022-11-23T04:12:28.0094830Z + python3 -m pip install boto3==1.19.12 2022-11-23T04:12:28.3257739Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T04:12:29.3408550Z Collecting boto3==1.19.12 2022-11-23T04:12:29.3593068Z Downloading boto3-1.19.12-py3-none-any.whl (131 kB) 2022-11-23T04:12:29.4179257Z Collecting jmespath<1.0.0,>=0.7.1 2022-11-23T04:12:29.4222130Z Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB) 2022-11-23T04:12:29.4762099Z Collecting s3transfer<0.6.0,>=0.5.0 2022-11-23T04:12:29.4803373Z Downloading s3transfer-0.5.2-py3-none-any.whl (79 kB) 2022-11-23T04:12:30.6591356Z Collecting botocore<1.23.0,>=1.22.12 2022-11-23T04:12:30.6675639Z Downloading botocore-1.22.12-py3-none-any.whl (8.1 MB) 2022-11-23T04:12:30.8781113Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /home/ec2-user/.local/lib/python3.7/site-packages (from botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.26.12) 2022-11-23T04:12:30.9504327Z Collecting python-dateutil<3.0.0,>=2.1 2022-11-23T04:12:30.9550736Z Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) 2022-11-23T04:12:30.9766625Z Requirement already satisfied: six>=1.5 in /home/ec2-user/.local/lib/python3.7/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.16.0) 2022-11-23T04:12:31.2007003Z Installing collected packages: jmespath, python-dateutil, botocore, s3transfer, boto3 2022-11-23T04:12:32.1603205Z Successfully installed 
boto3-1.19.12 botocore-1.22.12 jmespath-0.10.0 python-dateutil-2.8.2 s3transfer-0.5.2 2022-11-23T04:12:32.2247605Z + python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-11-23T04:12:33.8284202Z [scribe] Scribe access token not provided, sending report via boto3... 2022-11-23T04:12:33.8284506Z 2022-11-23T04:12:33.8285645Z ----- Historic stats comparison result ------ 2022-11-23T04:12:33.8285907Z 2022-11-23T04:12:33.8286068Z job: linux-bionic-cuda11.6-py3.9-gcc7 2022-11-23T04:12:33.8287224Z commit: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T04:12:33.8287375Z 2022-11-23T04:12:33.8287670Z Commit graph (base is most recent master ancestor with at least one S3 report): 2022-11-23T04:12:33.8287954Z 2022-11-23T04:12:33.8290462Z : (master) 2022-11-23T04:12:33.8290724Z | 2022-11-23T04:12:33.8291023Z * 1cfd3858ac (HEAD) total time 6920.81s 2022-11-23T04:12:33.8291346Z * 26322544b8 0 reports 2022-11-23T04:12:33.8291592Z * 7f4b4d2827 0 reports 2022-11-23T04:12:33.8295023Z * b50699f247 0 reports 2022-11-23T04:12:33.8295319Z * 8bf8e4d71e 0 reports 2022-11-23T04:12:33.8295608Z * ce856cee7e 0 reports 2022-11-23T04:12:33.8295854Z * 391b593ca2 0 reports 2022-11-23T04:12:33.8296044Z * 5bba783d21 0 reports 2022-11-23T04:12:33.8296305Z * ea920a1115 0 reports 2022-11-23T04:12:33.8299603Z * 74e62a1fef 0 reports 2022-11-23T04:12:33.8299816Z * 00b7d8ef23 0 reports 2022-11-23T04:12:33.8300078Z | 2022-11-23T04:12:33.8300298Z : 2022-11-23T04:12:33.8300419Z 2022-11-23T04:12:33.8300591Z Removed (across 0 suites) 0 tests, totaling 0.00s 2022-11-23T04:12:33.8300960Z Modified (across 0 suites) 0 tests, totaling 0.00s 2022-11-23T04:12:33.8301313Z Added (across 118 suites) 1139 tests, totaling +6920.81s 2022-11-23T04:12:33.8867984Z ##[group]Run pytorch/test-infra/.github/actions/teardown-linux@main 2022-11-23T04:12:33.8868321Z with: 2022-11-23T04:12:33.8868537Z env: 2022-11-23T04:12:33.8868772Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:33.8869019Z GPU_FLAG: 
--gpus all 2022-11-23T04:12:33.8869383Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:33.8869727Z ##[endgroup] 2022-11-23T04:12:33.8901314Z ##[group]Run set -eou pipefail 2022-11-23T04:12:33.8901615Z set -eou pipefail 2022-11-23T04:12:33.8901869Z  2022-11-23T04:12:33.8902180Z echo "Holding runner for 2 hours until all ssh sessions have logged out" 2022-11-23T04:12:33.8902490Z for _ in $(seq 1440); do 2022-11-23T04:12:33.8902784Z  # Break if no ssh session exists anymore 2022-11-23T04:12:33.8903071Z  if [ "$(who)" = "" ]; then 2022-11-23T04:12:33.8903297Z  break 2022-11-23T04:12:33.8903560Z  fi 2022-11-23T04:12:33.8903787Z  echo "." 2022-11-23T04:12:33.8904486Z  sleep 5 2022-11-23T04:12:33.8904721Z done 2022-11-23T04:12:33.8918730Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T04:12:33.8919010Z env: 2022-11-23T04:12:33.8919250Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:33.8919518Z GPU_FLAG: --gpus all 2022-11-23T04:12:33.8919866Z DOCKER_CONTAINER_ID: 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:33.8920342Z ##[endgroup] 2022-11-23T04:12:33.8952060Z Holding runner for 2 hours until all ssh sessions have logged out 2022-11-23T04:12:33.9040317Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2022-11-23T04:12:33.9040742Z # ignore expansion of "docker ps -q" since it could be empty 2022-11-23T04:12:33.9041082Z # shellcheck disable=SC2046 2022-11-23T04:12:33.9041391Z docker stop $(docker ps -q) || true 2022-11-23T04:12:33.9041693Z # Prune all of the docker images 2022-11-23T04:12:33.9041992Z docker system prune -af 2022-11-23T04:12:33.9053613Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T04:12:33.9053892Z env: 2022-11-23T04:12:33.9054134Z GIT_DEFAULT_BRANCH: master 2022-11-23T04:12:33.9054404Z GPU_FLAG: --gpus all 2022-11-23T04:12:33.9054753Z DOCKER_CONTAINER_ID: 
4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:33.9055108Z ##[endgroup] 2022-11-23T04:12:35.6677466Z 4e69a3f2c44b 2022-11-23T04:12:36.1791863Z Deleted Containers: 2022-11-23T04:12:36.1792271Z 4e69a3f2c44bf2fb09616e0240efc76f4075b5fc109cf899979fb54cd5fa5968 2022-11-23T04:12:36.1792533Z 2022-11-23T04:12:41.8169954Z Deleted Images: 2022-11-23T04:12:41.8171108Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T04:12:41.8172272Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7@sha256:3a5626edfb2c43fb24303351be75287af92426b6bb7c6df2defc98f980346c6a 2022-11-23T04:12:41.8172886Z deleted: sha256:e2c63e8434298b5b8922fe396fb22d541e83da3321f8559334df676354c6a90a 2022-11-23T04:12:41.8173283Z deleted: sha256:e97e2654456ae35786d9ff4a73ece4d85ce36ae9bd4e402e5f8c4c41a4b8cb5d 2022-11-23T04:12:41.8173722Z deleted: sha256:0191afefc9967131b7cd6196bee5a1d3a4eba8c24d3e11ff67013ecd0d244f4d 2022-11-23T04:12:41.8174136Z deleted: sha256:cd6998962c740e934e511d315fd0139a2737289173123cd7675b630fe71d0a6f 2022-11-23T04:12:41.8174603Z deleted: sha256:684c9dbbc4faf4388438a99012caaa6e9e9c3ac93f3842ff7b2f4c81c6c66866 2022-11-23T04:12:41.8175080Z deleted: sha256:be75865fc66b386df8a53dd220b7f4fa8464d0c86f06b6fa84e7d5b8fa2b5333 2022-11-23T04:12:41.8175647Z deleted: sha256:9e5281171ccc5aa329fd085f38d4831c13f47e27ea26a9243daf336fc701114a 2022-11-23T04:12:41.8176056Z deleted: sha256:0ba6072392ef0b01b99d45293e62f415e397460b4bf5a00257afb7aa9cfccb14 2022-11-23T04:12:41.8177286Z deleted: sha256:5f0fab79723550908a4149737ce5268ceacba20bc9c1aea35acdb6ff93ba4aa7 2022-11-23T04:12:41.8177749Z deleted: sha256:0c8088138816657b983280a5e4385f5c159c90b6be095bc4972290be20d46c16 2022-11-23T04:12:41.8178156Z deleted: sha256:a9cdd96267ff8adf28efa06db7d37977216a7580ca475239528fff85024f9bcb 2022-11-23T04:12:41.8178614Z 
deleted: sha256:9abd11e0b20ee19055f20e11ac5a4cc029eee3433686ce8ab9ffb6636269391a 2022-11-23T04:12:41.8179057Z deleted: sha256:cb16cc59b9c802a04fe3283c4a00840d0a3c24128b3620964a7aa927a757d672 2022-11-23T04:12:41.8179503Z deleted: sha256:ed27e40372acea88785f25bcd63f03a56960f00e444e3d5b22e52915e885242b 2022-11-23T04:12:41.8179947Z deleted: sha256:395dfa2cf9efd2fde511c14dbaf706e2efb3ab003af0cd725614b86f10643247 2022-11-23T04:12:41.8180402Z deleted: sha256:ca415181cb076083a9af8e85b901ee24154183e2d4c3960e21aab48260376214 2022-11-23T04:12:41.8180824Z deleted: sha256:b13fc2861b47406c24208813cb5398b911d9bae952f11ed9a411f42e221f8dfc 2022-11-23T04:12:41.8181265Z deleted: sha256:9cbf0b121bab50c1cad2d31b40f6c7c52003ba77877a2ef6d9bc87a2c0b073d2 2022-11-23T04:12:41.8181734Z deleted: sha256:60e157b04ecdbe2bce04795e0fade9ec9aae999065bd410785dcbaedd9778a19 2022-11-23T04:12:41.8182171Z deleted: sha256:5eb96691864f520823a417cd2f3278b4c2ac579490941d6c623865e478828c8b 2022-11-23T04:12:41.8182583Z deleted: sha256:e93d6940ac64ac73f178cc63066fb2c3ab041023d66146b32019cb7860511be5 2022-11-23T04:12:41.8183014Z deleted: sha256:e302e1f04c7e3031f83227f08d6987b02f39a75ac0e741754afad2dc1e265f8a 2022-11-23T04:12:41.8183445Z deleted: sha256:d82cdf793dbcd047c1843326443a1249721e7308a7c6fb3e23fe7331652e7047 2022-11-23T04:12:41.8184529Z deleted: sha256:3edb430c2f9009d4993daf017be01fe272bd3452db11c16e51f7755ac845d410 2022-11-23T04:12:41.8185018Z deleted: sha256:16e8f362c1784e16c1db6b1d0aa4449097e6d646f4c8682a122dea7c4da38aaf 2022-11-23T04:12:41.8185438Z deleted: sha256:7f58576cf19df9f3be9082f2c0ec2fc7010409b97ecb99bae66a10805d752f48 2022-11-23T04:12:41.8185867Z deleted: sha256:88688611a15825ecae20cd8c4032711d2351d2f954a9ebcd4c671b2bdb017df8 2022-11-23T04:12:41.8186312Z deleted: sha256:a46e0b74ccdcd4e2eb07727be3bc1a2c4236b1f88c65e64a50234e8a35932a80 2022-11-23T04:12:41.8186698Z deleted: sha256:b633962159aa14dfe94a149d00f90eecaba6dab960d4011bdf3667a5ee9586db 2022-11-23T04:12:41.8187102Z deleted: 
sha256:a05c7409499ce8c5d7ffc085772c3910c812ec835dc9145bbbb07b8b3c075235 2022-11-23T04:12:41.8187544Z deleted: sha256:0d63a7de5066f69cd9fd1af8fc47405e880de8f88f5cb16278a1f1ac94d0cd41 2022-11-23T04:12:41.8187971Z deleted: sha256:7d74b4ce1a60334100fccc0917345873714e640160622691b579d64c0ae4640f 2022-11-23T04:12:41.8188377Z deleted: sha256:33aae29ffc4791507bef289cfb1f178909f3fc97a40c618723eeec1f8f5bd80c 2022-11-23T04:12:41.8188850Z deleted: sha256:6ea72b84f0436ed1d288baf124dd38e43bbb89e746ccfd3a4ec420ddced8bbc2 2022-11-23T04:12:41.8189284Z deleted: sha256:04e33e1cfdd5a1b2409b80f5881e6cf7b1810fe975aad4ce7c97b0ff6c0e7b4e 2022-11-23T04:12:41.8189729Z deleted: sha256:df1ef30e86bf04681ecd0728263efe1e98b2eea0a228cef29bd0febfc8bdac2f 2022-11-23T04:12:41.8190180Z deleted: sha256:36a44974e500014175f5e49f50c8afa1ac9c5e8092a8ea99c3c97b7ce9c517d8 2022-11-23T04:12:41.8190617Z deleted: sha256:a31f0224d50d031823b07dbb97f256f6960c87ea3c52ebeceef98febab200451 2022-11-23T04:12:41.8191061Z deleted: sha256:c4beec84548d277aff0487a9a5a8c2b3d577421e3275f36106b778c6edbb9d53 2022-11-23T04:12:41.8191472Z deleted: sha256:bcc7df3b45729f5d1802045954e76e3407d9e07ba6f516de0895d775d00ad7f8 2022-11-23T04:12:41.8191914Z deleted: sha256:84de992a179a16ba619507ec45b04b4c0da3d3fa31cedc8f6beb5aaadd7a232a 2022-11-23T04:12:41.8192357Z deleted: sha256:5011206a0b2edc2a6c68ba41313e7f283ee7c925ab6a731f8818d01352f68596 2022-11-23T04:12:41.8192771Z deleted: sha256:46a56b12ac94daa35c90ac97d26adfde704693e34613d69fb97687aa53ae33f5 2022-11-23T04:12:41.8193216Z deleted: sha256:ed2b7a9e28b3474bc9b7e68f8158ecda88b3fa3d3ab1587898fa976922af0deb 2022-11-23T04:12:41.8193659Z deleted: sha256:4a6976746db7764bb48f2a06af1fb5f88e3646edc1c9bc0d18686d5a6350cac0 2022-11-23T04:12:41.8194220Z deleted: sha256:5e175425e3e9ec93e8c6c1b7560b49ef5e95af68ec55757902072a8dca020323 2022-11-23T04:12:41.8194640Z deleted: sha256:fb740502513c6cf883c844f03760de367c4c70d09a69b9476bcf737b4578563a 2022-11-23T04:12:41.8195056Z deleted: 
sha256:2c105119fc030d11b3d570ec9a83948a1fb17f138df2a3245f9566b89de51495 2022-11-23T04:12:41.8195500Z deleted: sha256:8caad6b6cba0d0ced7e21fe4b2027b8647d66b7f78c34367dd8571a0520ba2c0 2022-11-23T04:12:41.8195942Z deleted: sha256:1051db32aefad193995ca536ed99e29eed4fd0340ddda721ec11e9c4eb9e93af 2022-11-23T04:12:41.8196387Z deleted: sha256:c6b2a4553f41b3b4a3dc6a26be0020c98980bb4e7186d194901769dce6716c27 2022-11-23T04:12:41.8196841Z deleted: sha256:8faec3528fe75bb31f14d0caf8707a2fe4b70f60d7e631c2b3dbb36cd6d83dd9 2022-11-23T04:12:41.8197288Z deleted: sha256:7574bc80094251ac667e6bed9dd5a808ecf6f61f23c8d4c56a69c644d06f4e32 2022-11-23T04:12:41.8197700Z deleted: sha256:69f57fbceb1b420d7e4697e0f6514887b0805ee0059bea7d51e0a832962e74bf 2022-11-23T04:12:41.8197952Z 2022-11-23T04:12:41.8330987Z Total reclaimed space: 18.96GB 2022-11-23T04:12:41.8393123Z Post job cleanup. 2022-11-23T04:12:41.8433152Z Post job cleanup. 2022-11-23T04:12:41.9824524Z [command]/usr/bin/git version 2022-11-23T04:12:41.9887922Z git version 2.37.1 2022-11-23T04:12:41.9952883Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/807ca356-4d39-4225-b1f3-caea94711c23' before making global git config changes 2022-11-23T04:12:41.9953486Z Adding repository directory to the temporary git global config as a safe directory 2022-11-23T04:12:41.9959452Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T04:12:42.0020550Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-11-23T04:12:42.0095785Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-11-23T04:12:42.0452758Z Entering 'android/libs/fbjni' 2022-11-23T04:12:42.0499307Z Entering 'third_party/FP16' 2022-11-23T04:12:42.0546083Z Entering 'third_party/FXdiv' 2022-11-23T04:12:42.0592327Z Entering 'third_party/NNPACK' 
2022-11-23T04:12:42.0639320Z Entering 'third_party/QNNPACK' 2022-11-23T04:12:42.0684972Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T04:12:42.0731986Z Entering 'third_party/XNNPACK' 2022-11-23T04:12:42.0793751Z Entering 'third_party/benchmark' 2022-11-23T04:12:42.0839725Z Entering 'third_party/cpuinfo' 2022-11-23T04:12:42.0886581Z Entering 'third_party/cub' 2022-11-23T04:12:42.0934007Z Entering 'third_party/cudnn_frontend' 2022-11-23T04:12:42.0987128Z Entering 'third_party/cutlass' 2022-11-23T04:12:42.1040943Z Entering 'third_party/eigen' 2022-11-23T04:12:42.1095203Z Entering 'third_party/fbgemm' 2022-11-23T04:12:42.1143145Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T04:12:42.1193528Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T04:12:42.1242028Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T04:12:42.1292250Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T04:12:42.1342294Z Entering 'third_party/flatbuffers' 2022-11-23T04:12:42.1394702Z Entering 'third_party/fmt' 2022-11-23T04:12:42.1447279Z Entering 'third_party/foxi' 2022-11-23T04:12:42.1499298Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T04:12:42.1550576Z Entering 'third_party/gloo' 2022-11-23T04:12:42.1601537Z Entering 'third_party/googletest' 2022-11-23T04:12:42.1654003Z Entering 'third_party/ideep' 2022-11-23T04:12:42.1702199Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T04:12:42.1755295Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T04:12:42.1813004Z Entering 'third_party/ios-cmake' 2022-11-23T04:12:42.1862694Z Entering 'third_party/ittapi' 2022-11-23T04:12:42.1912192Z Entering 'third_party/kineto' 2022-11-23T04:12:42.1958820Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T04:12:42.2008171Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T04:12:42.2056886Z Entering 'third_party/nccl/nccl' 2022-11-23T04:12:42.2106441Z Entering 'third_party/neon2sse' 
2022-11-23T04:12:42.2155620Z Entering 'third_party/nlohmann' 2022-11-23T04:12:42.2205379Z Entering 'third_party/onnx' 2022-11-23T04:12:42.2268171Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T04:12:42.2317569Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T04:12:42.2366601Z Entering 'third_party/onnx-tensorrt' 2022-11-23T04:12:42.2413410Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T04:12:42.2467713Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T04:12:42.2516219Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T04:12:42.2563491Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T04:12:42.2617519Z Entering 'third_party/pocketfft' 2022-11-23T04:12:42.2665682Z Entering 'third_party/protobuf' 2022-11-23T04:12:42.2720579Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T04:12:42.2768799Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T04:12:42.2818530Z Entering 'third_party/psimd' 2022-11-23T04:12:42.2869860Z Entering 'third_party/pthreadpool' 2022-11-23T04:12:42.2917121Z Entering 'third_party/pybind11' 2022-11-23T04:12:42.2963973Z Entering 'third_party/python-enum' 2022-11-23T04:12:42.3012007Z Entering 'third_party/python-peachpy' 2022-11-23T04:12:42.3059006Z Entering 'third_party/python-six' 2022-11-23T04:12:42.3105309Z Entering 'third_party/sleef' 2022-11-23T04:12:42.3150963Z Entering 'third_party/tbb' 2022-11-23T04:12:42.3200555Z Entering 'third_party/tensorpipe' 2022-11-23T04:12:42.3248864Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T04:12:42.3295948Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T04:12:42.3343023Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T04:12:42.3391041Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T04:12:42.3437627Z Entering 
'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T04:12:42.3486287Z Entering 'third_party/zstd' 2022-11-23T04:12:42.3557517Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-11-23T04:12:42.3592069Z http.https://github.com/.extraheader 2022-11-23T04:12:42.3602026Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2022-11-23T04:12:42.3646965Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-11-23T04:12:42.4018566Z Entering 'android/libs/fbjni' 2022-11-23T04:12:42.4053263Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4091260Z Entering 'third_party/FP16' 2022-11-23T04:12:42.4119331Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4156220Z Entering 'third_party/FXdiv' 2022-11-23T04:12:42.4184151Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4220129Z Entering 'third_party/NNPACK' 2022-11-23T04:12:42.4246837Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4285128Z Entering 'third_party/QNNPACK' 2022-11-23T04:12:42.4312484Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4353850Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T04:12:42.4380652Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4420695Z Entering 'third_party/XNNPACK' 2022-11-23T04:12:42.4447798Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4497242Z Entering 'third_party/benchmark' 2022-11-23T04:12:42.4525628Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4561397Z Entering 'third_party/cpuinfo' 2022-11-23T04:12:42.4590368Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4627049Z Entering 'third_party/cub' 2022-11-23T04:12:42.4653213Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4696559Z Entering 
'third_party/cudnn_frontend' 2022-11-23T04:12:42.4724849Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4768776Z Entering 'third_party/cutlass' 2022-11-23T04:12:42.4795769Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4843188Z Entering 'third_party/eigen' 2022-11-23T04:12:42.4870271Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4908729Z Entering 'third_party/fbgemm' 2022-11-23T04:12:42.4934121Z http.https://github.com/.extraheader 2022-11-23T04:12:42.4968533Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T04:12:42.5000522Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5037009Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T04:12:42.5067484Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5104581Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T04:12:42.5130639Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5168730Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T04:12:42.5195850Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5242084Z Entering 'third_party/flatbuffers' 2022-11-23T04:12:42.5270736Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5311244Z Entering 'third_party/fmt' 2022-11-23T04:12:42.5341819Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5379912Z Entering 'third_party/foxi' 2022-11-23T04:12:42.5407566Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5447881Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T04:12:42.5474246Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5511561Z Entering 'third_party/gloo' 2022-11-23T04:12:42.5538883Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5576038Z Entering 'third_party/googletest' 2022-11-23T04:12:42.5603878Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5643620Z Entering 'third_party/ideep' 2022-11-23T04:12:42.5669561Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5706391Z Entering 
'third_party/ideep/mkl-dnn' 2022-11-23T04:12:42.5733712Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5775841Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T04:12:42.5804125Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5850345Z Entering 'third_party/ios-cmake' 2022-11-23T04:12:42.5879836Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5916365Z Entering 'third_party/ittapi' 2022-11-23T04:12:42.5946361Z http.https://github.com/.extraheader 2022-11-23T04:12:42.5982146Z Entering 'third_party/kineto' 2022-11-23T04:12:42.6008615Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6047078Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T04:12:42.6073855Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6113296Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T04:12:42.6140849Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6182394Z Entering 'third_party/nccl/nccl' 2022-11-23T04:12:42.6209784Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6248590Z Entering 'third_party/neon2sse' 2022-11-23T04:12:42.6276202Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6314362Z Entering 'third_party/nlohmann' 2022-11-23T04:12:42.6343199Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6382478Z Entering 'third_party/onnx' 2022-11-23T04:12:42.6410686Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6467351Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T04:12:42.6496513Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6536792Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T04:12:42.6563112Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6603111Z Entering 'third_party/onnx-tensorrt' 2022-11-23T04:12:42.6632991Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6669164Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T04:12:42.6695490Z 
http.https://github.com/.extraheader 2022-11-23T04:12:42.6741108Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T04:12:42.6768202Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6808870Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T04:12:42.6836295Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6880276Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T04:12:42.6908347Z http.https://github.com/.extraheader 2022-11-23T04:12:42.6954289Z Entering 'third_party/pocketfft' 2022-11-23T04:12:42.6982883Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7023013Z Entering 'third_party/protobuf' 2022-11-23T04:12:42.7049943Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7092849Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T04:12:42.7119771Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7157764Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T04:12:42.7187777Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7247759Z Entering 'third_party/psimd' 2022-11-23T04:12:42.7274773Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7314366Z Entering 'third_party/pthreadpool' 2022-11-23T04:12:42.7344000Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7379539Z Entering 'third_party/pybind11' 2022-11-23T04:12:42.7406528Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7443362Z Entering 'third_party/python-enum' 2022-11-23T04:12:42.7472202Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7509363Z Entering 'third_party/python-peachpy' 2022-11-23T04:12:42.7535182Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7575698Z Entering 'third_party/python-six' 2022-11-23T04:12:42.7602023Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7638390Z Entering 'third_party/sleef' 2022-11-23T04:12:42.7665159Z 
http.https://github.com/.extraheader 2022-11-23T04:12:42.7701431Z Entering 'third_party/tbb' 2022-11-23T04:12:42.7727687Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7765169Z Entering 'third_party/tensorpipe' 2022-11-23T04:12:42.7793099Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7827640Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T04:12:42.7852605Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7888405Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T04:12:42.7916193Z http.https://github.com/.extraheader 2022-11-23T04:12:42.7953537Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T04:12:42.7978641Z http.https://github.com/.extraheader 2022-11-23T04:12:42.8017903Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T04:12:42.8046402Z http.https://github.com/.extraheader 2022-11-23T04:12:42.8081743Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T04:12:42.8109828Z http.https://github.com/.extraheader 2022-11-23T04:12:42.8154267Z Entering 'third_party/zstd' 2022-11-23T04:12:42.8182673Z http.https://github.com/.extraheader 2022-11-23T04:12:42.8513298Z Cleaning up orphan processes